FEBRUARY 1958
VOLUME 22
NUMBER 1




Journal of Consulting Psychology
College, Columbia University,

Orr, Assistant Managing Editor


Bardin, Department of Psychology, University of Michigan, Ann Arbor, Michigan.
1, 1958, to the Editor, Laurance F. Shaffer, 525 West 120th St., New York 


advertising, and other business matters, should be sent to the American
1389 St. N.W., Washington 6, D.C. Address 

Eh the subscription office by the 10th of the month to take effect 


the American Psychological Association,
the American Psychological Association, Inc.


Rater’ » Unisersity of California. 
of Miuceticn, Wor . Luckey, B Geveland, Obie. Fred 
McKinney, University of Missouri. Anne Roe, New York, N. Y. Carl Rogers, University of Wisconsin.
Nevitt Sanford, University of Edward Joseph Shoben, Teachers College, Columbia
George J. Wischner, University of Pittsburgh. J. R. Wittenborn, Rutgers University.
The Journal of Consulting Psychology is the clinical journal of the American Psychological
bs Americen Documentid: Pasties, To save printing 
i in the style described in the Palllesiion Manuel of the costs, i to articies may be de- 
| American. Prychelegignl Association (1957 Revision), posited with To obtain copies, order by Docu- 
for $1.00 fren: the Washington. office of ment Number im the reference footnote, from 
APAL ADI Auxiliary’ Project, Photoduplication 
Service, Library of Congres, Washington 25, D. C. 
the early The indicated fee noust be remitted in advance. Make 
acceptable the fret issue checks payable to Chief, Photoduplication Service, 
90 to efter ite Sy ertanging for Library of Congress. 
thee article ‘Sy; printed os enti pages, incrensing ints. Authors receive fifty reprints gratis, ex- 
She of pages that receive. Informe- for carly publication articles-and Brief Reports. 
about charges for publication may be ob- /-‘itional quantities may be ordered whea returning 
Psychological
must reach
Undelivered
will not be replaced; subscribers should notify the post
that they will guarantee second-class forwarding postage. Other claims for undelivered
be made within four months of publication.


JOURNAL OF CONSULTING PSYCHOLOGY

February, 1958 Vol. 22, No. 1
Contents 
Factors in Length of Stay and Progress in Psychotherapy: Patrick L. Sullivan, Christine Miller, and William Smelser

Analysis of Self and Peer Personality Ratings of Psychotherapist and =~ with Patient Ratings: Eli A. Rubinstein

A Scale for Personality Rigidity: John M. Rehfisch

Validity of the Marsh-Hilliard-Liechti MMPI Sexual Deviation Scale in a State Prison Population: John B. Wattron

The Use of Multivariate Statistical Analysis of Minnesota Multiphasic Personality Inventory Scores in the Classification of Delinquent and Nondelinquent High School Boys: Peter P. Rempel

Progressive Matrices (1938) and Emotional Disturbance: Sidney Kasper

The Polarity of Psychological Tests: Marshall B. Jones

An Experimental Group Version for School Children of the Progressive Matrices: Read D. Tuddenham, Louis Davis, Leslie Davison, and Richard Schindler

Test Anxiety Level and Goal-Setting Behavior: E. Phillip Trapp and Donald H. Kausler

Further Normative Data on the Progressive Matrices: Gerald Sperrazzo and Walter L.

The Achievement Motive and Field Independence: Jack Wertheim and Sarnoff A. Mednick

Assimilation, Failure-Avoidance, and Anxiety: David E. Hunt and Harold M. Schroder

Q-Sort Correlations: Stability and Random Choice of Statements: Arnold H. Hilden

Manifest Anxiety in Prisoners Before and After CO2: Dell Lebo, Robert A. Toal, and

Bender-Gestalt Test Correlates of Emotional Depression: John E. Tucker and Mimi J.

Assaultiveness and Two Types of Rorschach Color Responses: Robert Sommer and —— Twente Sommer

A Normative Note on Sentence Completion Cross-Se.. . .atification Responses: Francis W.

Some Determinants of the Perception of Hostility: Bernard I. Murstein

Social Desirability as a Variable in the Edwards Personal Preference Schedule: Norman L. Corah, Marvin J. Feldman, Ira S. Cohen, Walter Gruen, Arnold Meadow, and — A. Ringwall

A Factorial Isolation of Psychiatric Outpatient Syndromes: M a Helen Tatom

Psychological Test Reviews


Brief Reports

The Journal of Consulting Psychology will accept Brief Reports of research studies in clinical psychology for early publication without expense to the author. The procedure is intended to permit the publication of soundly designed studies of specialized interest or limited importance which cannot now be accepted because of lack of space. Several pages in each issue will be devoted to Brief Reports, published in the order of their receipt without respect to the dates of receipt of the regular articles. Most Brief Reports appear in the first or second issue to go to press following their final acceptance.

An author who wishes to submit a Brief Report:

1. Sends the Brief Report, limited to one printed page and prepared according to the specifications given below.

2. Also sends to the Editor a full report of the research study, in sufficient detail to give a clear account of its background, procedure, results, and conclusions, which will be filed with the American Documentation Institute to insure indefinite availability.

3. Prepares at least 100 mimeographed copies of the full report, which the author will send without charge to all who request it as long as the supply lasts.

4. Agrees not to submit the full report to another journal of general circulation.

Specifications

Brief Report. The Brief Report should give a clear, condensed summary of the procedure of the study and as full an account of the results as space permits.

To insure that the Brief Report will be no longer than one printed page, its typescript, including all matter except the title and the author’s lines, must not exceed 75 lines averaging 42 characters and spaces in length.
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 
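
The length rule above can be checked mechanically. The following is a minimal Python sketch and not part of the journal's instructions: the function names and the decision to ignore blank lines are illustrative assumptions. It counts the lines of a typescript and their average length, and reproduces the stated line widths from the usual typewriter pitches (12 characters per inch for elite, 10 for pica).

```python
MAX_LINES = 75   # line limit stated in the specifications
AVG_CHARS = 42   # average characters and spaces per line

def fits_brief_report(lines):
    """Return True if the body stays within 75 lines averaging 42 characters.

    `lines` holds the typescript lines, excluding the title and author lines.
    Blank lines are ignored, which is an assumption of this sketch.
    """
    body = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not body:
        return True
    average = sum(len(ln) for ln in body) / len(body)
    return len(body) <= MAX_LINES and average <= AVG_CHARS

def line_width_inches(chars, chars_per_inch):
    """Width of a typed line for a given typewriter pitch."""
    return chars / chars_per_inch

elite_width = line_width_inches(AVG_CHARS, 12)  # elite pitch
pica_width = line_width_inches(AVG_CHARS, 10)   # pica pitch
```

The two widths agree with the 3.5 and 4.2 inches given in the text.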

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 75 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 75-line quota:¹

¹ An extended report of this study may be obtained
without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name 
and address), or for a fee from the American Docu- 
mentation Institute. Order Document No. ——, re- 
mitting $—— for microfilm or $—— for photo- 
copies. 

Extended report. Because the extended report is intended for photoduplication, and is not copy to be sent to a printer, its style should differ in several ways from that of other manuscripts: (a) The extended report should be typed with single spacing for economy in duplication. (b) Tables and figures should be placed adjacent to the text which refers to them. A caption should be typed below each figure. (c) Footnotes should be typed at the bottom of the page on which reference is made to them. In other respects, the full report is prepared in the style specified by the Publication Manual (1).


Reference 


1. American Psychological Association. Council of 
Editors. Publication manual of the American 
Psychological Association (1957 rev.). Wash- 
ington, D. C.: American Psychological Asso- 
ciation, 1957. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


Factors in Length of Stay and Progress
in Psychotherapy¹


Patrick L. Sullivan, 
Edward Glaser & Associates, San Francisco 


Christine Miller, 


VA Mental Hygiene Clinic, Oakland 


and William Smelser 
VA Hospital, Oakland 


Research in the area of prediction of out- 
come of psychotherapy has produced mainly 
contradictory findings and at the best only 
low degrees of predictive accuracy for any one 
variable. The reason probably lies in the fact 
that a wealth of factors may affect outcome 
in therapy. For the sake of convenience, these 
may be roughly classified into three groups: 
(a) characteristics of the patient; (b) characteristics
of the therapist; (c) situational
variables, such as clinic policies as to length 
of treatment, eligibility requirements, and 
external factors in the patient’s life. 

Because so many factors may be important 
in determining outcome of therapy in each 
one of these three groups, it seems unreasonable
to expect more than a low degree of predictability
for any one factor which may be
selected for study. Also, when one cross-vali- 
dates the findings of a previous investigator 
or changes the setting, factors of relevance to 
the outcome of therapy are almost certain 
to change in each of the three categories. 
Thus we find one investigator, Gallagher (5), 
who reported that high scores on Depression 
on the MMPI are associated with “success” 
in psychotherapy, while Barron (2) found 
that high Depression scores are associated 
with lack of improvement in psychotherapy. 
The possibilities for variation in factors of
relevance to outcome of therapy are so numerous
that the difficulties in generalizing
from one setting to another seem almost insurmountable.

¹ Approved for publication by Veterans Administration
Regional Office, San Francisco.

These thoughts have been generated from 
our experience in a study of factors related 
to length of stay and progress in treatment. 
After having several hypotheses confirmed in 
one sample, we undertook to cross-validate 
our findings with two other samples in the 
same setting. The inconsistency of most of 
our findings from one sample to another is of 
perhaps as much interest as a few of our find- 
ings which were consistent. 

The purposes of this study were threefold: 

1. We wished to evaluate the relationship 
of social status variables to length of stay and 
progress in psychotherapy in a VA mental 
hygiene clinic. Previous investigators (1, 8, 
9, 10) had found that high social status was 
associated with favorable outcome in therapy, 
and we wished to ascertain the extent to which 
this relationship would hold in our clinic. 

2. We wished to explore some of the psy- 
chological variables which might be associ- 
ated with social status in the hope of gaining 
a better understanding of what makes social 
status a predictor of outcome in psycho- 
therapy. Special scales from the MMPI were 
used in this endeavor, and these will be de- 
scribed below. We were also interested in any 
psychological differences between groups hav- 
ing favorable and unfavorable outcomes in 
therapy which would come to light on the 
MMPI. 



3. We wished to examine available data on 
therapists in relation to length of stay and 
improvement. 


Method 


The data for the present study were gath- 
ered from the records of patients in the Vet- 
erans Administration Mental Hygiene Clinic, 
Oakland, California. Only the records of male, 
white veterans of either World War II or the 
Korean action were included. Data from pa- 
tients who terminated treatment by reason of 
legal ineligibility were not included. Our total 
sample consisted of 268 patients, of whom 
196 had completed the Minnesota Multiphasic 
Personality Inventory at the time of initial 
contact with the clinic. 

Data concerning the psychodiagnostic cate- 
gories of the patients are not presented. The 
diagnoses were examined by the investigators, 
and it was concluded that they were too un- 
reliable and too contaminated with legalistic 
procedures to provide a sound basis for group- 
ing. It may be noted, however, that only ap- 
proximately 25% of the patients were desig- 
nated as schizophrenic, so that our findings 
apply predominantly to psychoneurotic and 
psychosomatic cases. 

The study was begun by gathering the rec- 
ords of cases closed in 1952-53, which num- 
bered 131 (Group A). These data were ana- 
lyzed and the findings checked against a 
sample drawn in a similar manner from the 
1951 closings. It became apparent that an in- 
sufficient number of cases was available for 
the purpose (N is 43 for this second group,
hereafter designated Group B), so an addi- 
tional sample was drawn from 1954-55 clos- 
ings. This last sample consisted of 94 cases 
and will be called Group C. 

The variables chosen for investigation may 
be divided into four groups as follows: 


A. Demographic measures: 

a. Age. 

b. Education. The number of years of schooling 
completed was translated into a seven-point scale in 
accordance with Warner’s Code (11). In this code 
high school graduates are coded three, college under- 
graduates two, and college graduates one. 

c. Occupation. Vocations were also transformed 
into scale scores following the above scheme, and 
again the relationship is an inverse one. 

d. Education plus Occupation. The scaled scores of
the two single measures were added together to approach
a more comprehensive measure of socioeconomic status.

e. Education minus Occupation. The difference in 
coded values for Education and Occupation provided 
a measure of the discrepancy between education and 
vocation. 
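
The coding of the demographic measures can be illustrated with a short Python sketch. Only the three education codes named above (high school graduate = 3, college undergraduate = 2, college graduate = 1) come from the text; the dictionary keys, the function name `status_scores`, and the sample occupation code are illustrative assumptions.

```python
# Education codes quoted in the text from Warner's seven-point scale
# (lower code = more schooling). The remaining four codes are not
# given in the article and are deliberately left out.
EDUCATION_CODE = {
    "high school graduate": 3,
    "college undergraduate": 2,
    "college graduate": 1,
}

def status_scores(education_code, occupation_code):
    """Return the two composite measures built from the scaled scores."""
    return {
        "ed_plus_occ": education_code + occupation_code,
        "ed_minus_occ": education_code - occupation_code,
    }

# Hypothetical patient: a high school graduate whose occupation is coded 4.
example = status_scores(EDUCATION_CODE["high school graduate"], 4)
```

Because both scales run inversely, a lower sum indicates higher socioeconomic status, and a negative difference indicates an occupation that ranks below the patient's education.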


B. Psychological test measures (N = 196): 
a. MMPI clinical and validating scales. 
b. Special MMPI scales: 

1. Social Status (developed by Gough [6] and 
found by him to be an accurate reflection of actual 
social status). 

2. Intellectual Efficiency (developed by Gough 
[7]. This was used as our only available measure of
intellectual functioning. Scores on this scale have 
been shown to be significantly related to high school 
and college grades). 

3. Ego Strength (derived by Barron [3] by se- 
lection of items which differentiated a group of pa- 
tients who improved in psychotherapy from those 
who did not improve). 

4. Factor A (a measure of anxiety and general 
maladjustment developed by Welsh [12]). 

5. Factor R (a measure of both the ability and 
the tendency to repress developed by Welsh [12]). 


C. Therapist measures: 

a. Sex. 

b. Professional discipline (psychiatrist, psycholo- 
gist, social worker). 

c. Experience. An assignment of the therapists was 
made to “Experienced” if there was at least one year 
of staff work; otherwise, they were designated “In- 
experienced.” 


D. Outcome measures: 


a. Length of stay. The number of individual ther- 
apy interviews was used. In the vast majority of 
cases, interviews were held weekly, and that fre- 
quency may be taken as the definition of intensity 
of treatment for this study. The cases were split at 
the median number of interviews and designated 
Stay and Not-stay. 

b. Improvement ratings. Ratings on a five-point 
scale were made by the individual therapists after 
termination. One point indicated “worse,” and the 
other four points ranged from “no change” (two) 
to “much improved” (five). Only the cases which 
were above the median number of interviews, i.e., 
the Stay group, were included in the Improved and 
Not-improved data analysis. Patients who received 
the lowest three points on the rating scale were cate- 
gorized as “Unimproved” and the others as “Im- 
proved.” It may be noted that these ratings, depend- 
ing as they do wholly upon the judgment of the 
treating therapist, are of unknown reliability and
validity. Generalizations from the results must be 
regarded as exploratory only. In a sense, they may 
be looked upon as a measure of therapists’ habits in 
rating improvement. The number of interviews, on 
the other hand, is a more objective measure. 
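
The two outcome dichotomies can be sketched in Python as follows. The function names and sample data are hypothetical, and placing cases that fall exactly at the median in the Not-stay group is an assumption, since the text says only that the Stay group lay above the median.

```python
from statistics import median

def split_stay(interview_counts):
    """Label each case Stay or Not-stay by a split at the median.

    Cases above the median are labeled Stay, following the text; placing
    cases exactly at the median in Not-stay is an assumption of this sketch.
    """
    cut = median(interview_counts)
    return ["Stay" if n > cut else "Not-stay" for n in interview_counts]

def improvement_group(rating):
    """Collapse the five-point therapist rating into two categories."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    # The lowest three points are Unimproved, the highest two Improved.
    return "Improved" if rating >= 4 else "Unimproved"

# Hypothetical interview counts for seven cases (median is 9).
counts = [2, 5, 9, 9, 14, 30, 41]
labels = split_stay(counts)
```

In the study itself, `improvement_group` would be applied only to the cases labeled Stay.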




Results 
Length of Stay 


Table 1 contains the results for the 131 
cases constituting Group A, divided into Stay 
and Not-stay subgroups by dichotomizing at 
nine interviews, the median. Many measures 
significantly differentiate the two subgroups, 
with Education emerging as the most effec- 
tive single variable. The results of both the 
demographic and psychological measures show 
that the patients who were better integrated 
and more successful in life pursuits were the 
ones who remained in treatment longer. The 
Stay group is higher on Education and Occupation.
The MMPI findings show the Stayers to be
higher on IE, Social Status, Ego
Strength, and on R and K. Both of the latter 
may be regarded as reflecting the ability to 
hold together defenses effectively. Pa, on the 
other hand, is higher in the Not-stay group, 
which, along with the A measure, indicates 
greater maladjustment in that group. In the 


main, a general kind of psychological health 
appeared to be associated with the higher so- 
cial status characteristics of the Stay group. 
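
The critical ratios (CR) in Table 1 are not defined in the text. A common definition, assumed here, is the difference between two independent means divided by the standard error of that difference. The Python sketch below applies it to the Age row of Table 1 (means 30.7 and 32.9, SDs 5.9 and 6.9); the subgroup sizes of 65 and 66 are an assumption based on a median split of the 131 cases, since the article does not report them.

```python
import math

def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two means over the standard error of the difference."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / standard_error

# Age row of Table 1; the subgroup sizes are assumed, not reported.
cr_age = critical_ratio(30.7, 5.9, 65, 32.9, 6.9, 66)
```

Under these assumptions the computed value is about 1.96, close to the tabled CR of 2.0.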

In Group A, most of our expectations with 
regard to psychological differences on the 
MMPI between Stayers and Not-stayers were 
confirmed. In contrast with Group A, the 
findings on psychological variables in Groups 
B and C shift markedly and are not consist- 
ent with those of Group A. 

In Table 2 are given the means, CR’s and 
t’s where significant, for Groups B and C. 
Nonsignificant differences are not shown. The 
median number of interviews for Group B 
was identical with that obtained for Group A, 
i.e., nine, but was only five for Group C. No 
cause for this drop has been discovered, for 
there were no changes in the clinic’s treat- 
ment policy nor any important alterations in 
the eligibility requirements. 

Table 1
Mean Differences Between Group A Stay and Non-stay Subgroups

                                  Stay            Non-stay
Variable                       Mean     SD      Mean     SD      CR      p

Demographic
  Age                          30.7    5.9      32.9    6.9     2.0     .05
  Educational rating (Ed)      2.92    1.4      3.72            3.68    .001
  Occupational rating (Occ)    4.33    1.4      4.82    1.2     2.23    .02
  Ed plus Occ                          2.2      8.55    2.1     3.15    .001
  Ed minus Occ                -1.44    1.4     -1.06    1.3     1.45    .10

MMPI special scales
  Social status                20.1    4.1      19.0    4.2     1.31    .10
  Ego strength                 41.5   12.5      38.5    7.3     1.47    .07
  Intellectual efficiency      40.3    7.1      36.6    8.2     2.39    .01
  A                            19.1    7.8      22.3    8.4     1.75    .04
  R                            18.3    4.0      16.5    5.8     1.80    .04

MMPI regular scales
  L                            47.9    8.1      48.1    4.8             ns
  F                           539.7   10.2       0.9   10.9             ns
  K                            52.4   10.3      48.1    8.9     2.56    .01
  Hs                           76.6   15.8      75.7   18.2             ns
  D                            79.5   15.4      82.1   18.1             ns
  Hy                           72.7    8.9      73.0   13.0             ns
  Pd                         6701.5             66.4   14.1             ns
  Mf                           62.8   14.1        04   16.0             ns
  Pa                           56.7   10.6      61.7   13.5     2.35    .02
  Pt                           72.1   15.5      73.3   18.7             ns
  Sc                           68.2   16.1       0.5   17.1             ns
  Ma                           55.0    9.2      54.0   12.0             ns


Table 2
Cross-Validation of Differences Between Stay and Non-stay Subgroups in Groups B and C

Group and                         Stay            Non-stay
Variable                       Mean     SD      Mean     SD    CR or t   p

Group B
  Education                    2.85     .93     3.70    1.12    2.65    .02
  Occupation                   4.24    1.74     4.67    1.36            ns
  Ed plus Occ                  7.08    2.62     8.38    1.90    1.81    .10
  Ed minus Occ                -1.39    1.18     -.97    1.34            ns
  Pt (a)                       78.8             70.7            2.1     .05
  Sc (a)                       75.4             66.3            1.88    .10

Group C
  Ego strength                 36.8   10.3      41.0    6.2
  Hs                           79.3   18.8      71.3   11.5     2.57    .02
  D                            85.2   15.8      76.4   18.3     2.56    .02

(a) SDs not given because data, unfortunately, were discarded after t's had been computed.


Since the inconsistencies between groups might have followed from their being dichotomized at different points, it was decided to re-examine the data for Group C, making the division into subgroups not at five but nine interviews. The results of this revised alignment of cases are given in Table 3. The greatest changes that result from this shift are in the means of the demographic variables. These results are much more similar to those of Group A, but none of them reaches significance. Among the MMPI scales, only minor differences in means between the original and revised subgroups of Group C are effected.


Table 3
Differences Between Stay and Non-stay in Group C (Revised alignment)

                                  Stay            Non-stay
Demographic Variables          Mean     SD      Mean     SD

  Age                          33.3    8.06     34.1   867.68
  Educational rating           2.86    1.15     3.09    1.27
  Occupational rating          4.30    2.47     4.97    1.84
  Ed plus Occ                  7.36    3.05     8.05    2.80
  Ed minus Occ                -1.46    1.40    -1.87    1.46

Note. All differences are nonsignificant.


Because none of our samples was very large, the data for Groups A, B, and C were combined to give a larger N, using the revised assignment of cases for Group C. The results are shown in Table 4. There are no significant differences between the Stay and Not-stay groups on the MMPI scales; in fact, the profiles of the Stay and Not-stay groups are strikingly similar. This finding was to be expected because of the few significant differences within the individual samples and the frequent reversals of direction. Differences in demographic measures, on the other hand, are all significant except Education minus Occupation, Education being the most differentiating among them.

While we confirmed the findings of other investigators that social status variables are relevant to length of stay, we failed to identify any psychological variables which consistently differentiate the Stayers from Not-stayers.


Therapeutic Improvement


It will be recalled that the analysis of data for factors related to progress in treatment was limited to those cases included in the Stay category. Also, the division into Improved and Unimproved was effected by subsuming under the former those cases rated in the highest two of the five points on the rating scale.

The handling of the data paralleled that described for the length-of-stay material. Because the Ns in Groups A, B, and C taken separately are so small, only the combined data for an N of 98 for demographic variables and 83 for the MMPI variables are presented. It should be noted, however, that compared with the frequent reversals from year to year for the Stay and Not-stay groups, there was much more consistency from sample to sample on the MMPI variables. Table 5 presents the findings for the total group of Improved compared with Unimproved. In marked contrast to the finding that the Stay and Not-stay groups were almost identical in MMPI profiles, nearly all the clinical scales show significant differences between the Improvers and Unimprovers. The Improvers have generally lower profiles than the Unimprovers, suggesting, as Barron has said, that those who improve in psychotherapy are less sick in the first place (2).

Turning to the demographic variables, the means of the Improved patients are seen to have contributed disproportionately to the differences found between the Stay and Not-stay groups. That is, the Improved show the Stay tendencies to a greater degree than do the Unimproved. The mean of Education for the Stay group contained in Table 4 is 2.84, for example, while the Improved group mean in Table 5 is 2.67; the Improved are better educated than the Stayers as a whole.


Table 4
Total Group (A, B, and C), Differences Between Stay and Non-stay

                                  Stay            Non-stay
Variable                       Mean     SD      Mean     SD      CR      p

Demographic
  Age                          31.4    5.8      33.2    7.48    2.19    .05
  Educational rating (Ed)      2.84    1.19     3.42    1.26    3.74    .001
  Occupational rating (Occ)    4.27    1.87     4.90    1.82    2.74    .01
  Ed plus Occ                  7.20    2.69     8.30    2.43    3.52    .001
  Ed minus Occ                -1.43    1.35    -1.46    1.48            ns

MMPI special scales
  Social status                20.0             19.4                    ns
  Ego strength                 39.0             39.1                    ns
  Intellectual efficiency      39.0             38.4                    ns
  A                            20.1             20.6                    ns
  R                                             17.0

MMPI regular scales
  L                            47.8             47.7                    ns
  F                            61.3             60.8                    ns
  K                            51.2             49.1                    ns
  Hs                           76.3             74.1                    ns
  D                            82.0             80.5                    ns
  Hy                           73.8             73.0                    ns
  Pd                           67.5             68.2                    ns
  Mf                           63.7             62.5                    ns
  Pa                           58.9             61.7                    ns
  Pt                           73.9             72.2                    ns
  Sc                           71.6             69.6                    ns
  Ma                           56.9             57.3                    ns


Table 5
Total Group (A, B, and C), Differences Between Improved and Unimproved Cases

                                 Improved        Unimproved
Variable                       Mean     SD      Mean     SD      CR      p

Demographic
  Age                          31.0    5.8      31.5    6.4             ns
  Educational rating           2.67    1.31     2.97    1.36            ns
  Occupational rating          3.89    1.33     4.65    1.30    2.84    .01
  Ed plus Occ                  6.56    1.70     7.60    2.46    1.97    .10
  Ed minus Occ                -1.18    0.98    -1.68     .94    2.57    .01

MMPI special scales
  Social status                20.2             19.5                    ns
  Ego strength                 40.1             38.5                    ns
  Intellectual efficiency                       38.4
  A                            18.0    6.6      21.8            2.69    .01
  R                            18.2             17.9                    ns

MMPI regular scales
  L                            49.0             46.8                    ns
  F                            57.0    8.4      64.8    7.5     4.82    .0001
  K                            52.8             50.1                    ns
  Hs                           74.1    7.5      78.1    8.3     2.5     .02
  D                            77.2   11.5      85.7   12.2     3.54    .001
  Hy                           70.4    5.1      76.4    5.4     5.65    .0001
  Pd                           64.8    9.3      69.3    8.6     2.46    .02
  Mf                           63.3             63.9                    ns
  Pa                           56.7             60.6                    ns
  Pt                           70.4   10.7      76.9   11.8     2.84    .01
  Sc                           67.4   11.4      75.3   13.6     3.11    .001
  Ma                           54.7    7.4      58.6    7.1     2.70    .01


As the psychological findings indicated that the Improved were less disturbed at the outset of therapy than the Unimproved, so do the findings on the demographic variables show that they also held better positions and were employed in capacities more closely in keeping with their higher educational achievement.

Turning to other findings not covered in the foregoing tables, it might first be noted that there is a highly significant difference between the means of the number of interviews of the Improved and Unimproved. The means are 31.9 and 21.3, respectively, which, with standard deviations of 24.1 and 15.9, yield a CR of 2.5 (p < .01). In view of the manner in which our progress ratings were obtained (i.e., from the treating therapist only), the determination of whether this finding is a “therapist” or “patient” phenomenon can only be left to speculation.

The usefulness of single measures for individual prediction was closely examined. None of them provides an adequate basis for such a purpose, although Education approached significance for length-of-stay predictions. If the large group of high school graduates is eliminated, the groups with less than high school versus those with more yield highly significant differences with regard to length of stay.


Therapist Variables


It was indicated earlier that measures of therapist variables were also studied. None of the results of this analysis of therapist variables was significant. Therapists’ sex and professional training showed only chance relationships with length-of-stay and progress ratings. Experienced therapists showed a systematic but nonsignificant inclination to (a) keep patients in treatment longer and (b) rate marked improvement more frequently. It was also found, though again not consistently through the various samples, that more experienced therapists were assigned patients with better prognosis, i.e., with higher educational and occupational standing. In order to learn whether the superior outcomes achieved by experienced therapists were a function of experience or of better patient prognosis, the following analysis was made: Patients were divided into High and Low Education groups. Chi squares were computed by comparing, first, Stay and Not-stay proportions between Experienced and Inexperienced therapists within the High Education group; then the same comparison was made for the Low Education group. A similar analysis was made for the Improved and Unimproved groupings. It was found that the Experienced Therapist-High Education combination showed a higher probability for better outcome. The results were, however, far short of statistical significance (p equals approximately .30 for both length of stay and progress). The Low Education group had equally good or bad outcome regardless of therapist experience.

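
The stratified comparison described above can be sketched in Python. The counts below are invented solely to show the computation, and the use of the uncorrected Pearson chi square for a 2 x 2 table is an assumption; the article does not say whether a continuity correction was applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Hypothetical counts within one education stratum:
# rows = Experienced / Inexperienced therapists, columns = Stay / Not-stay.
high_education = chi_square_2x2(18, 12, 14, 16)
```

The same function would be applied separately to the Low Education stratum, and again to the Improved and Unimproved groupings, each result being referred to the chi-square distribution with one degree of freedom.
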
Discussion 

The failure of the findings concerning length 
of stay of Group A to be substantiated in the 
other samples suggests several lines of thought. 
The most obvious implication to be drawn is 
that findings in this research area must be 
viewed with considerable tentativeness until 
thoroughly validated. This is particularly true 
of psychological test findings, as Windle has 
pointed out in his excellent review of the 
literature (13). 

We might also take cognizance of other ways in which this study fails to meet standards of a comprehensive and rigorous investigation. The lack of objectivity in the therapeutic outcome ratings has already been noted, as has the fact that adequate diagnoses were not available. The amount of diagnostic homogeneity of our patient population can only be speculated on, and the differences in our measures among diagnostic categories are unknown. It has not been possible to describe the variations in competence and approach among therapists, nor is there information about such important items as date of onset of initial symptoms. Finally, our ratings do not encompass a follow-up period at all.

Despite these limitations, our results corro- 
borate previous studies and add further sub- 
stantiation to a picture which is evolving with 
ever-increasing clarity. The core of this is that 
those persons who are least equipped to meet 
life challenges are the ones who stand to gain 
least from psychotherapy. 

Aside from considerations of the situation 
and how it might be selectively contributing 
to the results, our findings highlight the dif- 
ference between “ability to use” and “need 
for” psychotherapy. These are not new con- 
cepts. They are obviously implicit in the cri- 
teria operating, for example, in the selection 
of patients for psychoanalysis during the trial 
analytic phase. In other treatment contexts, 
however, such as the one in which this study 


was conducted, the importance of the selec- 
tion of “good” patients is sometimes ob- 
scured. This has its commendable aspects, in 
that many patients who might be turned away 
from other treatment facilities are accepted 
and efforts made to rehabilitate them. It is 
the personal experience of the writers that
such efforts expand considerably the horizons 
of the professional worker and force the evo- 
lution of thinking that might never be pro- 
duced were the selection of patients more 
rigorous. There are, furthermore, salutary ef- 
fects achieved in a minority of cases. The 
worth of maintaining persons on jobs, in their 
own homes, and preventing them from be- 
coming wards of hospitals cannot be gainsaid. 

The fact remains that these two concepts 
are inversely related in our population, and 
alleviation is not being effected in many of 
those cases where it is most pressingly needed. 
Quite contrary to the requirements and, implicitly
at least, the expectations of society,
the patients most in “need” are being helped 
minimally if at all. 

If these persons are to be extended aid 
which may realistically be expected to be of 
benefit to them, it would appear that quite 
different approaches to their treatment will 
have to be evolved. In the development of 
these new attacks on the problem, the principal
findings of this and other research (8, 9, 10)
provide adequate starting points. In particular,
if psychotherapy is characterizable as
a process which places great importance on 
the ability of the patient to function at a high 
level of abstraction, the ability of the poten- 
tial patient to engage in this process will have 
to be more precisely determined. The finding 
with respect to Education supports the infer- 
ence that those with more schooling are both 
more intelligent and more capable of making 
abstractions about themselves. The research 
of Eells et al. (4) into the relatively lower 
abstractional ability of lower socioeconomic 
groups is pertinent to ours and to the find- 
ings of the Yale group (9). An approach such 
as the foregoing would be aimed at identify- 
ing those patients who are capable of entering 
into treatment as it is presently conducted by 
the majority of professional persons. 

What are the implications of these findings
for patients who do not demonstrate the ability
to function at a high level of abstraction?
The authors do not pretend to have compre- 
hensive answers to this question but can at 
least extend suggestions. For one thing, it 
would be advisable to structure treatment at 
the outset for these patients in such a way as 
to concretize it. Specific time limitations on 
the length of treatment could be discussed at 
the time of intake and definite problem areas 
would be mutually agreed on and adhered to. 
Periodic summing-up, focused on the problem 
areas initially delineated, might be introduced. 
The effort to rehabilitate would be steered by 
the therapist with minimal demands on self- 
awareness and would focus instead on more 
concrete alternatives to the patients’ current 
perceptions and behaviors. In short, therapy 
would be tailored to suit the individual’s func- 
tioning ability to deal with gradated levels of 
abstraction. 

Our study also illustrates the difficulties en- 
countered in this area of research. Cross-vali- 
dations in the same setting failed to confirm 
most of the original findings and pointed up 
the difficulties of generalizing results even 
from one year to the next. After considering 
our own difficulties in getting stable findings, 
we propose a few suggestions for further re- 
search in this area, most of which seem ob- 
vious, but which nevertheless have not often 
been adhered to. 

1. Cross-validation of findings is essential 
in this area of research. An unexpectedly large 
proportion of the obtained “significant” dif- 
ferences may vanish on repetition with suc- 
cessive patient groups in the same setting. It 
seems wise, therefore, to regard findings until 
they have been cross-validated as only inter- 
esting, possibly chance findings which need 
to be further tested. 

2. Large samples seem imperative because 
of the impossibility of controlling or even 
specifying many of the innumerable factors 
which can affect outcome in therapy. 

3. As many as possible of the conditions of 
the study—the characteristics of patients and 
therapists—should be specified to aid in the 
evaluation of findings. 

4. The study of groups which are homogeneous
with regard to factors related to prognosis
will enable meaningful differences to come
to light in the context in which they are relevant.

5. It may be that we will come to a view- 
point prevalent in industrial psychology that 
local norms are necessary for sufficiently ac- 
curate prediction. Preliminary work may help 
to delineate some broad factors of relevance 
to outcome in therapy. It may be that from 
there on our efforts will be best expended in 
developing norms for our particular settings, 
patient populations, and possibly therapists. 


Conclusion 


A study of the factors related to length of 
stay and progress in psychotherapy in a VA 
mental hygiene clinic was carried out. Par- 
ticular attention was given to measures of 
social status and to psychological variables 
which might be associated with social status. 
It was found that while the Stay and Not- 
stay groups did not show any distinguishable 
differences on the psychological variables used 
(MMPI), the demographic variables of Edu- 
cation and Occupational level did differentiate 
the Stayers from the Non-stayers. When pa- 
tients who were rated improved in psycho- 
therapy were compared with those rated un- 
improved, patients with higher occupational 
achievement and less psychopathology proved 
to be more successful in treatment. 


Received April 1, 1957. 


References 


1. Auld, F., Jr., & Myers, J. K. Contributions to a 
theory for selecting psychotherapy patients. J. 
consult. Psychol., 1954, 18, 56-60. 

2. Barron, F. Some test correlates of response to
psychotherapy. J. consult. Psychol., 1953, 17,
235-241.

3. Barron, F. An ego-strength scale which predicts
response to psychotherapy. J. consult. Psychol.,
1953, 17, 327-333.

4. Eells, K., Davis, A., Havighurst, R. J., Herrick,
V. E., & Tyler, R. W. Intelligence and cultural
differences: a study of cultural learning
and problem solving. Chicago: Univer. of Chicago
Press, 1951.

5. Gallagher, J. J. MMPI changes concomitant with
client-centered therapy. J. consult. Psychol.,
1953, 17, 334-342.

6. Gough, H. G. A new dimension of status: I. Development
of a personality scale. Amer. sociol.
Rev., 1948, 13, 401-409.

7. Gough, H. G. A nonintellectual intelligence test.
J. consult. Psychol., 1953, 17, 242-246.




8. Hollingshead, A. B., & Redlich, F. C. Schizophrenia
and social structure. Amer. J. Psychiat.,
1954, 110, 695-701.

9. Redlich, F. C., Hollingshead, A. B., Roberts,
B. H., Robinson, H. A., Freedman, L. Z., &
Myers, J. K. Social structure and psychiatric
disorders. Amer. J. Psychiat., 1953, 109, 729-734.

10. Roberts, B. H., & Myers, J. K. Religion, national
origin, immigration, and mental illness.
Amer. J. Psychiat., 1954, 110, 759-764.

11. Warner, W. L., Meeker, Marchia, & Eells, K.
Social class in America. Chicago: Science Research
Ass., 1949.

12. Welsh, G. S. Factor dimensions A and R. In
G. S. Welsh & W. G. Dahlstrom (Eds.), Basic
readings on the MMPI in psychology and
medicine. Minneapolis: Univer. of Minnesota
Press, 1956. Pp. 264-282.

13. Windle, C. Psychological tests in psychopathological
prognosis. Psychol. Bull., 1952, 49,
451-482.


Journal of Consulting Psychology
Vol. 22, No. 1, 1958


Analysis of Self and Peer Personality Ratings of Psychotherapists
and Comparison with Patient Ratings¹


Eli A. Rubinstein 


Studies attempting to compare factor pat- 
terns of normal and maladjusted persons de- 
rived from personality measurements on the 
same battery of tests have been relatively 
few. Among these studies the usual finding 
has been that the factor structure is similar 
even though the levels of scores differ. 

As part of a larger research on evaluation 
of change during psychotherapy, 100 psycho- 
therapists in ten VA mental hygiene clinics 
anonymously rated themselves on a multi- 
dimensional rating scale previously used with 
a patient sample. At the same time, 100 peers 
made ratings of these same therapists, with 
self raters sometimes acting as peer raters. 
These data lend themselves to further ex- 
amination of the relationship between factor 
patterns of normal individuals versus a pa- 
tient population. 

The major hypothesis tested was that the 
personality patterns of a nonpatient sample 
would not differ from personality patterns 
emerging from analysis of a patient sample. 
The analysis of the ratings was in terms of 
25 items representing six factors previously 
identified and confirmed with two successive 
samples of neuropsychiatric patients. Tetra- 
choric correlations of the 25 items with each 
other were computed for the 100 self ratings 
and the 100 peer ratings. The resulting two 
matrices were each independently clustered 
according to these previous factor patterns 


1 An extended report of this study may be obtained
without charge from Eli A. Rubinstein, Neuropsychiatric
Research Laboratory, Veterans Administration,
Veterans Benefits Office, Munitions
Building, Washington 25, D. C., or for a fee from
the American Documentation Institute. Order Document
No. 5432, remitting $1.25 for microfilm or
$1.25 for photocopies.


Veterans Administration and Catholic University 




and a multiple group method of factoring was 
employed. For the peer ratings of therapists, 
this a priori clustering did reveal linearly in- 
dependent groups. Rotation to simple struc- 
ture produced essentially the same six factor 
patterns as originally isolated with the pa- 
tient samples. For the self ratings of thera- 
pists, factoring of the a priori clusters broke 
down because of the linear dependence of 
one of the groups. 

As a result of this latter finding, the com- 
parable pairs of correlations in the two mat- 
rices of ratings were inspected. Evidence for 
greater generalization in peer ratings is to be 
found in the fact that only 35 self rating 
correlations are greater than .40, while 74 of 
the peer rating correlations exceed that value. 
Chi square for the two complete sets of cor- 
relations, split at a common median, is 9.63, 
p < .01.
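
The median-split comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the report does not say whether a continuity correction was applied, so none is used here, and the function name and the toy data in the usage note are hypothetical. With 25 items, each matrix holds 25 × 24 / 2 = 300 distinct correlations, which would be the two samples passed in.

```python
def median_split_chi_square(sample_a, sample_b):
    """Chi-square (1 df, no continuity correction) for two samples
    split at their common (pooled) median.

    Assumes at least one pooled value lies above the median and one
    does not; otherwise an expected frequency would be zero.
    """
    pooled = sorted(list(sample_a) + list(sample_b))
    n = len(pooled)
    mid = n // 2
    median = pooled[mid] if n % 2 else (pooled[mid - 1] + pooled[mid]) / 2.0

    # Count the values above the common median in each sample.
    a_hi = sum(1 for x in sample_a if x > median)
    b_hi = sum(1 for x in sample_b if x > median)
    a_lo = len(sample_a) - a_hi
    b_lo = len(sample_b) - b_hi

    table = [[a_hi, a_lo], [b_hi, b_lo]]
    total = a_hi + a_lo + b_hi + b_lo
    row_sums = [a_hi + a_lo, b_hi + b_lo]
    col_sums = [a_hi + b_hi, a_lo + b_lo]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, table
```

Called with the self-rating correlations as `sample_a` and the peer-rating correlations as `sample_b`, the returned statistic would be compared with the tabled chi-square value for one degree of freedom.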

In further examination of the two matrices, 
20 pairs of correlations were found in which 
the significance of the difference between the 
pair was beyond the .01 level. In all but two 
cases the high correlation was in the matrix 
of peer ratings. In all but three of these 
twenty pairs of correlations, one or both of 
the scales are rated, for the peer ratings, 
mainly on inference. Apparently, there is less 
concomitance of certain manifest and covert 
behavior, as reflected by their self ratings, 
than would be presumed from peer ratings. 

The finding suggests that when comparisons 
of ratings on covert behavior are made be- 
tween self and other ratings, interpretation of 
similarities or differences should be made with 
considerable caution. 

Brief Report. 
Received October 4, 1957. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


A Scale for Personality Rigidity¹ ²


John M. Rehfisch 
Berkeley, California 


This is a report of the construction, by em- 
pirical methodology, of a scale for personality 
rigidity. 

As part of an earlier project (6), a survey 
was made of a number of different types of 
personality studies in order to define the ma- 
jor elements and correlates of the rigidity— 
flexibility dimension. The types of behavior 
referred to by the dimension were found to 
be quite similar among all the studies, and 
the following summary of the qualities most
commonly included at the rigid pole was
made: (a) constriction and inhibition, (b)
conservatism, (c) intolerance of disorder and 
ambiguity, (d) obsessional and perseverative 
tendencies, (e) social introversion, (f) anx- 
iety and guilt. 

This summary may be considered as a com- 
prehensive connotative definition of the term 
personality rigidity, established empirically 

1 This paper presents a continuation of a study,
the first stage of which is contained in a dissertation 
submitted in partial fulfillment of the requirements 
for the Ph.D. degree in the department of psychol- 
ogy, University of California, Berkeley. The author 
is very much indebted to Harrison Gough for his ad- 
vice and encouragement throughout the course of 
this study and for his critical reading of the manu- 
script; and to Donald MacKinnon, Director of the 
Institute of Personality Assessment and Research, 
for many valuable suggestions and for permission to 
use the data and facilities of the Institute. 

2 A portion of this research was supported in part
by the USAF under Contract No. AF 18(600)-8, 
monitored by Technical Director, Detachment No. 7 
(Officer Education Research Laboratory), Air Force 
Personnel and Training Research Center, Maxwell 
Air Force Base, Alabama. Permission is granted for 
reproduction, translation, publication, use, and dis- 
posal in whole and in part by or for the United 
States Government. Personal views or opinions ex- 
pressed or implied in this publication are not to be 
construed as necessarily carrying the official sanction 
of the Department of the Air Force or of the Air 
Research and Development Command. 


according to the criterion of common psy- 
chological usage. The synopsis, though more 
extensive, incorporates all the elements of 
Fisher’s summary description of personality 
rigidity which is the following: “flexibility, 
fluidity, rigidity, perseveration and obsessive- 
compulsiveness . . . have all tended to refer 
to the idea that the behavior of people in 
different situations varies along a continuum 
marked at one extreme by cautious guarded- 
ness or limitation of reaction, and at the other 
extreme by freedom of reaction or lack of de- 
fensive caution” (2, p. 342). 

The degree to which the scale assesses the 
characteristics denoted by the above two syn- 
opses will be considered as an indication of 
its validity as a measure of personality ri- 
gidity. 


Procedure and Results 
Setting 


The raw data used in this study had been 
gathered previously in the course of a num- 
ber of assessment and field-testing programs 
carried on by the University of California’s 
Institute of Personality Assessment and Re- 
search (IPAR) (5). 


Subjects 


The assessed Ss whose responses were used 
in constructing the scale included the follow- 
ing: (a) 80 advanced graduate students (40 
in each assessed sample) from various de- 
partments in the University of California, 
(b) 80 senior medical students (40 in each
assessed sample) from the University of Cali- 
fornia Medical School, (c) 70 candidates for 
admission to the University of California 
Medical School, and (d) 100 Air Force cap- 
tains. The total N was therefore 330. 


Criterion Ratings 

As part of the assessment program, each 
assessee was rated by IPAR staff members, 
consisting of professional psychologists and 
graduate research assistants, on a number of 
variables, including that of rigidity. The mean 
rigidity rating for each S served as the cri- 
terion for scale derivation. 

The number of raters varied from sample to 
sample, but there were never fewer than five, 
and the average for all the groups was eight. 

The interrater reliability of the average rat- 
ing, computed by an analysis of variance tech- 
nique (1), ranged, across the samples, from 
.50 to .81, with an overall average of .73. 
Rigidity apparently exists as a sufficiently per- 
ceptible personality trait, so as to be rateable 
with significant interrater agreement. 
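
As an illustration of the kind of computation involved, the sketch below estimates the reliability of averaged ratings from an analysis of variance, in the spirit of the technique cited (1). It assumes a complete subjects-by-raters table and removes between-rater variance from the error term; the report does not state which variant was used, and the function name and sample ratings are hypothetical.

```python
def reliability_of_average(ratings):
    """Reliability of the mean rating across raters.

    `ratings` is a list of rows, one per subject, each holding one
    rating per rater (complete data assumed).
    Returns (MS_subjects - MS_residual) / MS_subjects.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_residual = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return (ms_subjects - ms_residual) / ms_subjects
```

When raters differ only by a constant (one rater uniformly more severe than another), the residual vanishes and the estimate is 1.0; disagreement about the ordering of subjects lowers it.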


Item Selection 


The scale was compiled from a pool of 
957 “true”—“false” personality inventory items 
taken from the MMPI, the California Psy- 
chological Inventory (CPI) (4), and a num- 
ber of items especially constructed for the 
IPAR assessments. The construction of the 
scale involved item analyses in which the in- 
ventory results of the highest and lowest 25 
per cent on rated rigidity were compared and 
items chosen according to the significance 
level of the difference between the propor- 
tions of “true” responses among the highs 
and the lows. 
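
The item analysis described above can be sketched as a test of the difference between two proportions. The report does not name the exact test, so the pooled-variance normal approximation below is an assumption, and the function name and the counts in the usage note are hypothetical.

```python
import math


def two_proportion_z(true_high, n_high, true_low, n_low):
    """z and two-tailed p for the difference between the proportions
    of "true" responses in the high-rated and low-rated groups."""
    p_high = true_high / n_high
    p_low = true_low / n_low
    pooled = (true_high + true_low) / (n_high + n_low)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
    z = (p_high - p_low) / se
    # Two-tailed probability under the standard normal curve.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


def rigid_response(z):
    """Direction in which the item would be keyed for rigidity."""
    return "T" if z > 0 else "F"
```

For example, an item answered "true" by 60 of 83 highs and 30 of 83 lows would yield a positive z significant beyond the .01 level and would be keyed (T).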


Cross-validation 


An unbiased validity measure, for a scale 
which has been empirically derived from a 
given sample, can be obtained only from a 
separate independent sample. Such a requisite 
for a just validity check is, however, incom- 
patible with the primary objective of obtain- 
ing items most stably related to the criterion 
dimension; for the latter objective requires
that all the available criterion samples be 
used for item selection. This problem was re- 
solved by constructing two preliminary scales 
from successively larger parts of the total 
sample, and in both cases employing other 
portions of the sample for obtaining the required
independent validity measures. The obtained
measures were then considered as approximate
estimates of the validity of the final
scale which was derived from the total cri- 
terion sample. 

The first preliminary scale was based on the 
responses of the highest and lowest 25 per 
cent on rated rigidity from among the 80 
graduate students, the 80 medical students, 
and 40 of the 100 AF captains. A cross-valida- 
tion coefficient for the first scale, computed 
by correlating scale scores against ratings of 
rigidity for the 60 AF captains who had not 
been used in deriving the scale, came to .35 
(two-tailed test: p < .01). This coefficient of 
.35 was considered as indicating that the first 
scale gives an adequate approximation of ri- 
gidity, and that this cross check confirmed the 
validity of the procedures utilized in its der- 
ivation. 
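
The significance of a cross-validation coefficient of this size can be checked with a short sketch. It uses Fisher's z transformation with a normal approximation, which is an assumption; the report does not say which test was applied, and the function name is hypothetical.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-tailed p for a correlation r based on n cases,
    using Fisher's z transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Coefficient and sample size reported above for the first scale.
p = correlation_p_value(0.35, 60)
```

Under this approximation the probability falls between .001 and .01, in line with the reported p < .01.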

The second scale was a modification of the 
first as determined by the responses of the 
additional 60 AF captains. In the case of this 
second scale, a more extensive cross-validat- 
ing appraisal was obtained by correlating 
scale scores with a large number of traits, as 
indicated by ratings and Q sorts made by 
the IPAR staff. The 70 medical school ap- 
plicants, none of whom had been used in de- 
riving the second scale, were employed for 
this analysis. 

The scale was found to correlate positively
with the following traits: (a) in interpersonal
situations tends to be a listener or spectator
.54; (b) constriction .51; (c) tends toward
over-control of his needs and impulses . . .
delays gratification unnecessarily .44; (d)
manifest anxiety .39; (e) has a readiness to
feel guilty .36; (f) tends to delay or avoid
decision .35; (g) does not vary roles; relates
to everyone in the same way .33; (h) tends
to ruminate and have obsessive thoughts .30;
(i) is uncomfortable with uncertainties and
complexities .30; (j) follows routine in living;
is orderly .28; (k) rigidity .19.

Negative correlations are with: (a) is self-indulgent
-.51, (b) fluency of ideas -.45,
(c) verbal fluency -.40, (d) impulsivity
-.35, (e) originality -.35.

A comparison of the above findings with
the previously presented synopses of personality
rigidity reveals an appropriate congruence
between the correlates of the second scale
and most of the elements contained in or
derivable from the synopses. The high scoring
Ss tend to be rated as constricted, over-con- 
trolled, anxious, socially introverted, guilt 
prone, inflexible in their social roles, compul- 
sively doubtful, obsessive and ruminative, un- 
comfortable with uncertainties and complexi- 
ties, and orderly. Two of the highest corre- 
lates, constriction .51 and over-control .44, 
are, in psychological parlance, practically 
synonymous with the term “personality ri- 
gidity.” 

By contrast, low scoring Ss are likely to be 
judged as outgoing in social situations, fluent 
in thought and speech, impulsive, original, 
and self-indulgent. 

These characteristics of low scorers are 
clearly indicative of flexibility, the bipolar 
opposite of rigidity. 

The low correlation with the rating of ri- 
gidity could be due primarily either to error 
in the scale or to error in the ratings. Since, 
in the present instance, the scale shows sub- 
stantial and appropriate correlations with the 
key elements of the rigidity-flexibility di- 
mension, it seems reasonable to question the 
accuracy and/or comprehensiveness of these 
particular rigidity ratings. 

The likelihood of considerable error in a 
rating situation does not of course disqualify 
ratings as useful criteria in personality re- 
search; but does indicate the desirability of 
using a relatively large sample of ratings, as 
was done in the present study, in order to 
balance out random error variance. 

Further cross-validating data on the second 
scale was obtained from an item analysis of 
the staff-checks on the Gough Adjective 
Check-List (3). As part of the assessment pro- 
gram, staff judges completed check-lists for 
each assessee. Composite staff descriptions 
were then derived by considering each adjec- 
tive which was selected for an S by three or 
more observers as being present, and all others 
as being absent. For the present analysis, 
adjective composites for the 18 highest and 
the 18 lowest scorers on the second rigidity 
scale from among the 70 medical school ap- 
plicants were compared. The following adjec- 
tives were found to differentiate significantly 
(.05 level) between high and low scorers. 


Adjectives more often checked about high scorers:
anxious, conscientious, conservative, deliberate,
dependent, gentle, inhibited, mild, moderate, modest,
painstaking, peaceable, quiet, reserved, retiring,
serious, shy, sincere, submissive, thorough, timid,
withdrawn, worrying.

Adjectives more often checked about low scorers:
active, adaptable, aggressive, argumentative, assertive,
clear-thinking, confident, curious, demanding,
energetic, independent, irritable, organized, outgoing,
outspoken, planful, poised, quick, resourceful,
self-centered, self-confident, self-seeking, sociable,
spontaneous, talkative, versatile.
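
The rule for forming composite staff descriptions (an adjective counts as present when three or more observers check it, and as absent otherwise) can be sketched as follows. The function name and the sample check-lists are hypothetical and serve only to illustrate the rule.

```python
from collections import Counter


def composite_description(checklists, min_observers=3):
    """Return the adjectives checked by at least `min_observers`
    observers. `checklists` holds one collection of adjectives per
    observer, all describing the same assessee."""
    counts = Counter()
    for checked in checklists:
        # Each observer contributes at most once per adjective.
        counts.update(set(checked))
    return {adj for adj, c in counts.items() if c >= min_observers}


observers = [
    {"anxious", "quiet"},
    {"anxious", "shy"},
    {"anxious", "quiet"},
    {"quiet"},
]
composite = composite_description(observers)
```

Here "anxious" and "quiet" are each checked by three observers and enter the composite, while "shy", checked once, is treated as absent.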


The most salient dimension defined by the 
two sets of adjectives seems to be a bipolar 
factor characterized by an anxious restraint,
and a meticulous conscientiousness, and con- 
servativism at one extreme; as contrasted 
with a vigorous, versatile, self-confident spon- 
taneity of thought and action at the other. 
These two contrasting categories are, in turn, 
very similar to the “cautious guardedness or 
limitation of reaction” vs. the “freedom of re- 
action or lack of defensive caution” which, as 
mentioned earlier, Fisher (2, p. 342) found 
to be the most essential characteristics of the 
rigidity-flexibility dimension. 

The final scale (to be known as Ri) was
based on the inventory results of the highest
and lowest 25 per cent on rated rigidity (83
highs and 83 lows) from the total criterion
sample.³ Ri is composed of 39 items, 20 significant
at the .01 level, 18 significant at the
.05 level, and 1 significant at the .06 level.


Scale Items 


The 39 Ri items, and the direction of the 
rigid response, are listed below. The items are 
grouped according to psychological categories, 
subjectively derived from an impressionistic 
content analysis made by the writer. 


Anxiety and constriction in social situations. 1. I 
usually don’t like to talk much unless I am with 
people I know very well. (T) 2. I like to talk before 
groups of people. (F) 3. It is hard for me to start a 
conversation with strangers. (T) 4. I would like to 
be an actor on the stage or in the movies. (F) 5. It 
is hard for me to act natural when I am with new 
people. (T) 6. I feel nervous if I have to meet a lot 
of people. (T) 7. I usually feel nervous and ill at 
ease at a formal dance or party. (T) 8. When I work 
on a committee I like to take charge of things. (F) 


3 The number of Ss per item actually varied some- 
what from item to item, since not all the samples 
took identical inventories. For the 39 Ri items, the 
N (“highs” plus “lows”) per item ranged from 106 
to 166, with a mean of 132. 




9. I usually take an active part in the entertainment 
at parties. (F) 10. I am a better talker than listener. 
(F) 11. I try to remember good stories to pass them 
on to other people. (F) 12. I am embarrassed with 
people I do not know well. (T) 13. A strong person 
doesn’t show his emotions and feelings. (T) 

Need for a stable, orderly, predictable environment;
perseverative tendencies. 14. I must admit that it 
makes me angry when other people interfere with my 
daily activity. (T) 15. I find that a well-ordered 
mode of life with regular hours is congenial to my 
temperament. (T) 16. It bothers me when some- 
thing unexpected interrupts my daily routine. (T) 
17. I don’t like to undertake any project unless I 
have a pretty good idea as to how it will turn out. 
(T) 18. I find it hard to set aside a task that I have 
undertaken, even for a short time. (T) 19. I don’t 
like things to be uncertain and unpredictable. (T) 

Slowness in coming to a decision—compulsive 
doubting. 20. I am very slow in making up my mind. 
(T) 21. At times I feel that I can make up my mind 
with unusually great ease. (F) 

Conservatism and conventionality. 22. I must admit 
I try to see what others think before I take a stand. 
(T) 23. I do not like to see women smoke. (T) 24. I 
would be uncomfortable in anything other than fairly 
conventional dress. (T) 25. I keep out of trouble at 
all costs. (T) 26. It wouldn’t make me nervous if 
any members of my family got into trouble with the 
law. (F) 27. I must admit that I would find it hard 
to have for a close friend a person whose manners 
or appearance made him somewhat repulsive, no 
matter how brilliant or kind he might be. (T) 28. I 
would certainly enjoy beating a crook at his own 
game. (F) 29. I would like the job of a foreign cor- 
respondent for a newspaper. (F) 

Self-doubt and sensitivity to negative criticism. 
30. I get very tense and anxious when I think other 
people are disapproving of me. (T) 31. I am cer- 
tainly lacking in self-confidence. (T) 32. Criticism or 
scolding makes me very uncomfortable. (T) 

Misanthropy and parsimony. 33. Most people in- 
wardly dislike putting themselves out to help other 
people. (T) 34. I am against giving money to beg- 
gars. (T) 35. Many of the girls I knew in college 
went with a fellow only for what they could get out 
of him. (T) 

Emphatic concern with work and study. 36. I al- 
ways follow the rule: business before pleasure. (T) 
37. I get disgusted with myself when I can’t under- 
stand some problem in my field, or when I can’t 
seem to make any progress on a research problem. 
(T) 

Miscellaneous. 38. I have never been made espe- 
cially nervous over trouble that any members of my 
family have gotten into. (T) 39. I have no fear of 
spiders. (T) 


The items suggest that the S rated high on 
rigidity differs, as indicated by self-report, 
from the low-rated S in being socially and 
emotionally constricted; anxious; intolerant 


Table 1

Summary Statistics for the 39-Item Ri Scale and for
27 Items of Ri Contained in the CPI

                                   39-Item Scale    27-Item Scale
Sample                       N      M      SD        M      SD
AF captains                 343   16.01   5.01     10.65   4.32
Medical school applicants    70   14.60   5.85      9.13   4.73
Total                       413   15.77   5.21     10.39   4.48


of disorder, irregularity, and unpredictability;
perseverative; slow in making decisions; con- 
servative; conventional; lacking in self-con- 
fidence; misanthropic; and obsessionally in- 
volved in work. 

Since most of these qualities are indicative 
of rigidity, it may be concluded that, among 
the assessed Ss, self-descriptions tend to be 
consistent with the rigidity ratings. 


Reliability of the Scale 


The reliability of Ri was estimated by the 
split-half technique. In order to approximate 
equivalence of halves, an attempt was made 
to match items for the two halves according 
to similarity of content. The corrected reli- 
ability came to .72 in a sample of 60 field- 
tested AF captains. These Ss were not in- 
cluded among the 100 assessed AF captains 
used in deriving the scale. 
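
The correction applied to a split-half coefficient is conventionally the Spearman-Brown formula; the report says only that the reliability was "corrected," so its use here is an assumption. The sketch below shows the formula and its inverse, with hypothetical function names.

```python
def spearman_brown(r_half):
    """Estimated full-length reliability from the correlation
    between two half-scales."""
    return 2 * r_half / (1 + r_half)


def half_correlation(r_full):
    """Inverse of the correction: the half-scale correlation that
    corresponds to a given corrected reliability."""
    return r_full / (2 - r_full)


# Half-scale correlation implied by the reported corrected value of .72,
# if the Spearman-Brown formula was in fact the correction used.
implied_r_half = half_correlation(0.72)
```

On that assumption, a corrected reliability of .72 would correspond to a correlation of about .56 between the two matched halves.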


Summary Statistics 


Table 1 contains the mean and SD for Ri
and for the 27 items of Ri which are contained
in the CPI,⁴ among a sample of 343
AF captains and a sample of 70 medical 
school applicants. The mean and SD for the 
combined distributions are the most repre- 
sentative norms currently available. 
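
The combined means in Table 1 can be reproduced from the two sample means as N-weighted averages. In the brief sketch below the function name is hypothetical, and the medical school applicants' 27-item mean is taken as 9.13. The combined SDs are not recomputed, since the report does not say how they were pooled.

```python
def pooled_mean(groups):
    """N-weighted mean of several samples.
    `groups` is a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# AF captains (N = 343) and medical school applicants (N = 70).
mean_39 = pooled_mean([(343, 16.01), (70, 14.60)])
mean_27 = pooled_mean([(343, 10.65), (70, 9.13)])
```

Rounded to two decimals, these agree with the Total row of Table 1 (15.77 and 10.39).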

In conclusion, it should be noted that the 
present applicability of Ri is limited to men 


4 The 27 CPI items, as numbered in the CPI booklet,
and the direction of the rigid response are as
follows: 13(T), 14(T), 16(T), 26(T), 38(T), 50(T),
52(F), 58(T), 83(T), 85(T), 109(T), 150(T),
159(T), 177(T), 219(T), 223(T), 239(F), 267(F),
282(T), 284(T), 288(T), 296(F), 305(T), 328(T),
418(T), 456(F), 471(F).




by the exclusively male composition of the 
samples used in scale construction. 


Summary 


A summary of the typical connotations of 
the personality rigidity concept was presented. 

By means of an item analysis technique, 
applied to 957 personality inventory items, a 
39-item scale for personality rigidity was de- 
rived. The criterion used for scale construc- 
tion was ratings of rigidity by staff assessors 
from the Institute of Personality Assessment 
and Research made on a total of 330 Ss. The 
highest and lowest 25 per cent on rated ri- 
gidity were used for the item analysis. 

Positive cross-validating evidence for two 
preliminary versions of the scale was pre- 
sented. 


Received April 11, 1957. 


References 


1. Ebel, R. L. Estimation of the reliability of ratings.
Psychometrika, 1951, 16, 407-424.

2. Fisher, S. An overview of trends in research dealing
with personality rigidity. J. Pers., 1949,
17, 342-351.

3. Gough, H. G. Reference handbook for the Gough
adjective check-list. Unpublished manuscript,
Univer. of California, Berkeley, Institute of
Personality Assessment and Research, 1955.

4. Gough, H. G. The California Psychological Inventory.
Palo Alto, California: Consulting Psychologists
Press, 1956.

5. MacKinnon, D. W. The Institute of Personality
Assessment and Research. Unpublished manuscript,
Univer. of California, Berkeley, Institute
of Personality Assessment and Research,
1951.

6. Rehfisch, J. M. Rigidity, character, and temperament:
A psychological analysis. Unpublished
doctoral dissertation, Univer. of California,
1954.




Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


Validity of the Marsh-Hilliard-Liechti MMPI Sexual
Deviation Scale in a State Prison Population¹


John B. Wattron²


Texas Prison System 


A scale derived from MMPI items for detecting
sexual deviation tendencies has been
reported by Marsh, Hilliard, and Liechti (1) 
which selected 88 per cent of sex deviants 
studied at the expense of 11 per cent false 
positives among UCLA students. Subse- 
quently, Peek and Storms (2) used this scale 
in a state hospital, comparing scores of sex 
offenders, mental patients, and psychiatric 
aids. They found the original cutting point of 
30 described by Marsh et al., to be useless in 
discriminating between these groups. A new 
cutting score of 42 was more effective, yet 
neither point differentiated between sex of- 
fenders and mental patients. They concluded 
that the scale measured gross maladjustment, 
rather than sexual deviation tendencies, a pos- 
sibility that the original authors admit. 

In the present work, 60 sexual offender in- 
mates were matched with an equal number of 
other type felons on age, race, and sex. All 
were administered the full scale MMPI. Mean 
scores and overlap comparisons were made 
between these two groups and those of the 
original study by use of present and previ- 
ously determined cutting points on the Sexual 
Deviation scale. Scores of sex offenders on 


1 An extended report of this study may be obtained
without charge from John B. Wattron, 806
West 31 St., Austin, Texas, or for a fee from the 
American Documentation Institute. Order Document 
No. 5433, remitting $1.25 for microfilm, or $1.25 for 
photocopies. 

2 Now in residence, University of Texas. 


the MMPI scales were compared with their 
Sexual Deviation scale scores by correlation 
procedures. 

Mean scores for the felon group and the 
sex offender group were 46.9 (SD, 9.3) and 
49.2 (SD, 9.4) respectively, both of which 
exceed the mean score of 41.8 (SD, 9.1) re- 
ported for sex deviants by Marsh et al. Previ- 
ously determined cutting points were ineffec- 
tive in discriminating between the two groups 
of prisoners. The best cutting score, namely 
48, was inadequate in differentiating between 
groups according to the chi-square test. Cor- 
relations were obtained between Sexual Devia- 
tion scale scores and MMPI scales on inmate 
sex offenders as follows: Pt, .63, and D, .58 
(p = .001); Pd, .52, Sc, .51, and Ma, .49
(p = .01); and F, .35 (p = .05). 

These results suggest that the Sexual De- 
viation scale is indeed a measure of gross mal- 
adjustment and that this instrument is of no 
practical value in discriminating between sex 
offenders and other type felons in correctional 
settings. 

Brief Report. 
Received September 20, 1957. 


References 


1. Marsh, J. T., Hilliard, Jessamine, & Liechti, R. A. 
A sexual deviation scale for the MMPI. J. 
consult. Psychol., 1955, 19, 55-59. 

2. Peek, R. M., & Storms, L. H. Validity of the 
Marsh, Hilliard, and Liechti sexual deviation 
scale in a state hospital population. J. consult. 
Psychol., 1956, 20, 133-136. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


The Use of Multivariate Statistical Analysis of 
Minnesota Multiphasic Personality Inventory 
Scores in the Classification of Delinquent 
and Nondelinquent High School Boys 


Peter P. Rempel¹


University Counseling Service, State University of Iowa


The research worker in psychology and 
education is frequently confronted with the 
problem of analyzing differences between two 
or more groups of individuals with respect to 
several variables. The problem of differentiat- 
ing certain selected groups in terms of MMPI 
scores, for instance, is a good case in point. 
However, the kinds of statistical procedures 
commonly used may be subject to criticism 
on several grounds. Briefly, the several em- 
pirical methods of approach which have been 
developed lack the rigidity of a strictly mathematical
model, while such univariate procedures
as the t test or analysis of variance,
when applied to separate keys of the MMPI, 
fail to take into account the amount of cor- 
relation that exists between scores on these 
keys. These procedures, in short, have failed 
to provide methods for obtaining mathemati- 
cally rigid statements of profile or pattern 
differences. 

It is one of the purposes of this paper to 
provide an illustration of the application of 
more recently developed statistical procedures,
namely the generalized distance function
(D²) and Fisher's discriminant function
(LDF), to the solution of problems involving
the comparison of MMPI profiles. These techniques
have been employed to find answers to
the following questions:

1. To what extent can groups of nondelinquent
and delinquent-prone ninth-grade high
school boys be differentiated in terms of
MMPI profiles alone or in combination with
school record information?

2. To what extent can ninth-grade boys be
classified correctly as either nondelinquents or
potential delinquents in terms of MMPI profiles
alone or in combination with school record
information?


1 Submitted to the graduate school of the University
of Minnesota in partial fulfillment of the requirements
for the Ph.D. degree.


Basic Population 


The basic population of this study consists 
of ninth-grade boys enrolled in the public 
high school system of the city of Minneapolis 
for the school year 1947-1948, and to whom 
the MMPI was administered by Hathaway 
and Monachesi (4) in the spring of 1948. Of 
the 2,521 boys enrolled, 1,997 completed the 
test. Invalid test profiles with L scores of 10
or over and/or F scores of 15 or over and 
omissions for various other reasons reduced 
the number of valid boys’ profiles to 1,802. 

In January, 1950, the median age of these 
youngsters tested was 17 years. A check made 
of the records of the Hennepin County Proba- 
tion Office and the Juvenile Division of the 
Minneapolis Police Department revealed that 
of those 1,802 boys 376 had appeared before 
the court, the police, or both. A second follow- 
up in 1952 raised the number of delinquent 
boys with valid profiles to 714. These profiles 
were distributed, in terms of severity of mis- 
conduct and number of delinquent acts com- 
mitted, among eight levels of delinquency as 
shown in Table 1. A preliminary study failed
to reveal any statistically significant differences
between a randomly selected set of nondelinquent
profiles and those of the delinquents assigned to
the lower two levels. This left a total of 351
valid delinquent profiles distributed among
Levels 3-8.


Table 1

Distribution of Valid MMPI Profiles by Delinquents,
Level of Delinquency, and Nondelinquents

                                     Per Cent      Per Cent
                                     (Based on     (Based on
Classification             Number    N = 1,802)    N = 714)

All Delinquents               714       39.62
  Levels 1 and 2              363       20.15        50.84
  Level 3                     231       12.82        32.35
  Level 4                      47        2.61         6.58
  Level 5                      31        1.72         4.34
  Level 6                      22        1.22         3.08
  Level 7                      12        0.67         1.68
  Level 8                       8        0.44         1.12
  Total Level 3-8             351       19.47        49.16
Nondelinquents              1,088       60.37
Total (all delinquents
  and nondelinquents)       1,802      100.00


In the major section of this study, referred
to as the main study, this set of 351 delinquent
profiles was divided by random sampling
procedures into two groups: Group A
(N = 175), which was used in the development
of the classification formula, and Group
B (N = 176), set aside for cross-validation
purposes. Two samples of 175 profiles each
were similarly selected from the nondelinquent
set of valid profiles. No statistically significant
differences, in terms of level of delinquency
and educational status achieved by 1952, were
found to exist between delinquent samples A
and B. The nondelinquent samples were found
to be equal in terms of the latter criterion.
The number of nondelinquents from a particular
school included in samples A or B was
made to match the number of delinquents
from that school.

An analysis of three variables, namely
family status, mobility rating, and ratings of
rent level, by means of chi-square techniques,
failed to discriminate between delinquent and
nondelinquent samples at statistically significant
levels. To the extent that these three factors
tend to reflect the socioeconomic status
of the student population under discussion,
delinquents and nondelinquents may be regarded
as coming from essentially the same
background.

A comparison of the means and variances
for the MMPI scores of delinquents and nondelinquents
is presented by means of Table 2.
Choice of MMPI variables for inclusion in
the classification formula was made on the
basis of their potential contribution to the
generalized distance function, D². One statistic
which provides a measure of this contribution
is the ratio of the square of the mean
group difference to an estimate of the variance
of the distribution for each variable, referred
to as d²/S²; another is the measure of
correlation between variables. The combination
of variables resulting in the widest possible
separation of the two groups in terms of
D² were the scores on the F, Hs, Sc, Mf, and
Pd scales, arranged in that order.


Table 2

Comparison of Sample A Means and Variances for MMPI Scores of
Nondelinquent and Delinquent Groups

          Stratified Sample of
            Nondelinquents         Total Delinquents             Tests of
              (N = 175)                (N = 175)              Significance a

Scales     Mean    Variance         Mean     Variance          t            F

L          3.81      5.77           3.90       4.81            .12         1.20
K         13.49     17.26          13.76      15.91            .21         1.08
F          5.40     10.53           6.90      13.30           4.06**       1.26
Hs        50.59     86.30          52.86      67.98           2.42*        1.27
D         51.64     96.17          52.28      96.19            .61         1.00
Hy        51.33     71.77          52.12      55.51            .93         1.29
Pd        57.51    110.94          64.18      88.45           6.47***      1.25
Mf        53.58     83.07          50.08   [illegible]     [illegible]  [illegible]
Pa        50.68     72.72          53.52      78.40           3.05**       1.08
Pt        54.99     85.07          57.47      97.11           2.43*        1.14
Sc        57.05     88.68          62.02     119.56           4.54**       1.35
Ma        56.95    101.14          61.17     114.14        [illegible]     1.13
Si        53.85     63.18          56.66      66.83           1.38         1.16

Note. Similar tests of significance using the ratio d²/S² yielded significant differences for the same eight scales.
a Level of significance of difference between means and variance estimates has been noted by asterisks:
  * Significance at the .05 level;
  ** Significance at the .01 level;
  *** Significance at the .001 level.


The School Study


A second part of the study, referred to as
the school study, dealt with the problem of
identifying juvenile delinquents on the basis
of a combination of MMPI scores and school
record data. The same basic population was
used in both studies. However, the group
from which experimental samples were chosen
in the school study consisted of a total of 869
boys used by Roessel (12) in his study of
high school dropouts. Of these, 412 had left
high school without graduating, while the rest
(N = 457) constituted a random sample of
those who had graduated. Of these 869 boys,
182 proved to be delinquents with valid
MMPI profiles. From the remaining 687 profiles,
a control group of 182 nondelinquent
profiles was chosen. As in the main study,
both delinquent and nondelinquent groups
were divided into Samples A’ and B’. Chi-square
tests revealed no statistically significant
differences in terms of delinquency level
and educational status between the delinquent
samples of the school and the main studies,
nor between the nondelinquent samples of the
two studies in terms of the latter criterion.
Comparisons of mean scores and variance estimates
for Sample A’ are reported by means
of Table 3. Variables selected in terms of
maximum contributions to D² were scores on
the Ma, Sc, Mf, and Pd scales of the MMPI,
honor point ratio, and number of days absent,
in that order.


Table 3

Comparison of Sample A’ Means and Variances for the Variables of the
Delinquent and Nondelinquent Groups in the School Study

                  Nondelinquents            Delinquents               Tests of
                     (N = 91)                (N = 91)              Significance a

Variables         Mean     Variance       Mean     Variance         t            F

L                3.7912      4.38        3.8022      6.23           .03         1.42
K               13.6813     21.06       13.7373     19.66           .08         1.07
F                5.8462     14.72        7.1758     10.91          2.49*        1.35
Hs              51.5055     60.29       53.3956     72.96          1.55         1.21
D               51.6263     90.94       52.4514     96.50           .06         1.06
Hy              51.6483     49.56       52.0771     54.07           .40         1.09
Pd              57.7912     94.89       64.9011    114.10          4.67**       1.20
Mf              51.9340     81.11       49.0440     80.71       [illegible]  [illegible]
Pa              52.2418     88.38       54.3297     80.43          1.53         1.10
Pt              55.3516     96.92       57.6593     67.01          1.71         1.45*
Sc              57.5824    141.17       62.7912     73.18          3.38**       1.93**
Ma              56.0330    141.72       61.4835    137.91       [illegible]     1.19
Si              52.1538     71.30       53.0549     52.71           .50         1.35
Days absent      7.0549    120.75       12.8791     32.76          4.96**       3.68**
IQ             103.4900    104.71       99.6800    130.41          2.36*        1.25
H.P.R.           2.1687      .3228       1.7171      .5574         4.57**       1.73**

a The level of significance is indicated by asterisks:
  * .05 level of significance;
  ** .01 level of significance.


Discussion of Techniques Employed 


Rao (9) has shown that the degree of over- 
lap between any two clusters of points in a 
multidimensional space is measured by the 
generalized distance function, D². The greater
the numerical value of D², the smaller the
amount of overlap, so that this measure may 
be regarded as the “distance” between groups. 
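
The two statistics named above can be illustrated with a short sketch. It is not the author's computing routine: the helper names (`solve`, `generalized_distance`, `univariate_ratio`) are invented for the illustration, the F and Pd means and variances are those read from Table 2, and the correlation of .30 between the two scales is an assumption borrowed from the median intercorrelation quoted in the paper's footnote, not a reported value for this pair.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def generalized_distance(mean_a, mean_b, pooled_cov):
    """D^2 = d' S^-1 d, where d is the vector of group mean differences."""
    d = [a - b for a, b in zip(mean_a, mean_b)]
    weights = solve(pooled_cov, d)
    return sum(di * wi for di, wi in zip(d, weights))


def univariate_ratio(mean_a, mean_b, var_a, var_b):
    """d^2 / S^2 for one variable; S^2 is the mean of two equal-n variances."""
    return (mean_a - mean_b) ** 2 / ((var_a + var_b) / 2.0)


# F and Pd scales, delinquents first, values as read from Table 2.
f_ratio = univariate_ratio(6.90, 5.40, 13.30, 10.53)
pd_ratio = univariate_ratio(64.18, 57.51, 88.45, 110.94)

# Pooled standard deviations and an ASSUMED correlation of .30.
s_f = ((13.30 + 10.53) / 2.0) ** 0.5
s_pd = ((88.45 + 110.94) / 2.0) ** 0.5
r_assumed = 0.30
cov = [[s_f ** 2, r_assumed * s_f * s_pd],
       [r_assumed * s_f * s_pd, s_pd ** 2]]

d2 = generalized_distance([6.90, 64.18], [5.40, 57.51], cov)

# With zero correlation, D^2 is simply the sum of the univariate ratios.
d2_uncorrelated = generalized_distance(
    [6.90, 64.18], [5.40, 57.51], [[s_f ** 2, 0.0], [0.0, s_pd ** 2]]
)
```

Under these assumptions the positive correlation between the two scales makes D² smaller than the sum of the two d²/S² ratios, which is why both the ratio and the intercorrelations matter when variables are chosen.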

To eliminate intervariable correlations Rao 
(11) suggested a method of transforming the 
original set of variables to an orthogonal set. 
Collier (3) presented illustrations of the uses 
of Rao’s pivotal condensation techniques to 
effect the transformation in the process of 
computing D². Methods for testing the significance
of D² have also been developed by
Rao (8, 9, 10). 

The application of Fisher’s discriminant
function to the problem of classification into
one of two known multivariate normal populations
was discussed by Anderson (1). Tests
of the effectiveness of the LDF in classifying
individuals correctly were presented by Lubin
(7). For complete and comprehensive discussions
of the techniques employed in this study
the reader is referred to Christensen (2),
Lockman (6), and Johnson (5).


Results


In the main study, analysis of the data
yielded a theoretical probability of correct
classification for both delinquents and nondelinquents
of 68.39 per cent. The overall
percentage is essentially the same as that
achieved by chance when the ratio of nondelinquents
to delinquents in the population
is known to be 80.5 to 19.5 per cent, as it is
in this study. However, classification of delinquents
by the LDF is more than three
times as effective as chance procedures, the
formula identifying 68.4 per cent of the delinquents
as against 19.5 per cent identified
by chance.


Empirical Cross Validation of Samples
A and B


The actual effectiveness of the linear discriminant
function was tested empirically by
applying the formulas developed on Sample A
to the delinquent and nondelinquent subsamples
in both Samples A and B. A study of
cross-validation results as presented in Table 4
indicates that the overall percentage of correct
classifications, namely 65.9 per cent, fell somewhat
short of the predicted percentage of 68.4.
The fact that the formula correctly identified
only 58.9 per cent of the nondelinquents in
Sample B accounts for most of the shrinkage.
However, 68.6 per cent of delinquents in Sample A,
and 70.5 per cent in Sample B were
correctly identified.


Table 4

A Summary Comparison of Classification Efficiency for the LDF Procedure with Chance Expectation
for Samples A and B in the Main Study by Delinquent and Nondelinquent
Groups within the Sample

                                     Correct            Correct
                                  Classification     Classification     Percentage
                                    by Formula         by Chance        Improvement
Samples Used                        N       %          N       %        over Chance a

Sample A (Dels.) (N = 175)         120     68.6       87.5    50.0        18.6**
Sample A (Nondel.) (N = 175)       115     65.7       87.5    50.0        15.7**
Total Sample A (N = 350)           235     67.1      175.0    50.0      [illegible]

Sample B (Dels.) (N = 176)         124     70.5       88.0    50.0        20.5**
Sample B (Nondel.) (N = 175)       103     58.9       87.5    50.0         8.9*
Total Sample B (N = 351)           227     64.7      175.5    50.0        14.7**

All Groups (N = 701)               462     65.9   [illegible] 50.0      [illegible]

a Level of significance is indicated by means of asterisks:
  * .05 level of significance;
  ** .01 level of significance.


Classification results were also broken down
in terms of delinquency level, as presented in
Table 5. The tabulation shows generally increasing
proportions of correct classification
with level of delinquency, Level 5 being the
only exception.


Table 5

Classification of Delinquents by Level of Delinquency Using LDF Procedures for Two Randomly Chosen
Samples of the Main Study: Sample A on Which Prediction Formulae Were Developed and
Sample B on Which the Formulae Were Applied Through Cross Validation

                                 Percentage of Correct Classification

                          Sample A            Sample B             Total
                          (N = 175)           (N = 176)          (N = 351)

Level a     Total N       N       %           N       %          N       %

3             231         82     70.7         78     67.8       160     69.3
4              47         14     60.9         18     75.0        32     68.1
5              31          7     46.7         12     75.0        19     61.3
6              22          9     81.8          8     73.0        17     77.3
7              12          4     66.7          5     83.3         9     75.0
8               8          4    100.0          3     75.0         7     87.5

Total         351        120     68.6        124     70.5       244     69.5

a In addition to the 351 delinquents in Levels 3 to 8 above which had been randomly assigned to Samples A and B, 363 pupils
had been classified as delinquents of Levels 1 and 2. Application of the same procedures identified 75 or 42.3% of the Level 1
group as delinquent and 111 or 60.0% of the Level 2 group, an overall correct classification of 51.2%, approximately the same as
chance expectation.


Results of the School Study


The probability of correct classification for
both delinquents and nondelinquents was
found to be 70.9 per cent.


Table 6

A Summary Comparison of Classification Efficiency for the LDF Procedures with Chance Expectations
for Samples A’ and B’ in the School Study by Delinquent and Nondelinquent
Groups within the Sample

                                     Correct            Correct
                                  Classification     Classification     Percentage
                                    by Formula         by Chance        Improvement
Samples Used                        N       %          N       %        over Chance a

Sample A’ (Dels.) (N = 91)          60     65.9       45.5    50.0      [illegible]
Sample A’ (Nondels.) (N = 91)       72     79.1       45.5    50.0      [illegible]
Total Sample A’ (N = 182)          132     72.5       91.0    50.0        22.5**

Sample B’ (Dels.) (N = 91)          63     69.2       45.5    50.0        19.2**
Sample B’ (Nondels.) (N = 91)       63     69.2       45.5    50.0        19.2**
Total Sample B’ (N = 182)          126     69.2       91.0    50.0      [illegible]

Total Samples A’ and B’
(N = 364)                          258     70.9      182.0    50.0        20.9**

a The level of significance is indicated by means of asterisks. Thus, two asterisks indicate significance at the .01 level.
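
The two kinds of chance comparison used in the main study can be retraced with a brief sketch. It assumes a normal approximation to the binomial for the test against a 50 per cent chance rate; the paper does not state which test the author applied, and the function names are invented for this illustration. The base rate of 19.5 per cent and the Sample B counts (124 of 176 delinquents, 103 of 175 nondelinquents) are taken from the text and Table 4.

```python
import math


def proportional_chance_accuracy(base_rate):
    """Expected hit rate when cases are labelled at random in proportion
    to the known base rates of the two groups."""
    return base_rate ** 2 + (1.0 - base_rate) ** 2


def z_against_chance(n_correct, n_total, chance=0.5):
    """Normal-approximation z for an observed hit rate against a chance rate."""
    observed = n_correct / n_total
    standard_error = math.sqrt(chance * (1.0 - chance) / n_total)
    return (observed - chance) / standard_error


# Overall chance accuracy with 19.5 per cent delinquents in the population.
overall_chance = proportional_chance_accuracy(0.195)

# Sample B of the main study, counts from Table 4.
z_b_delinquents = z_against_chance(124, 176)
z_b_nondelinquents = z_against_chance(103, 175)
```

On these assumptions the proportional-chance figure comes to about 68.6 per cent, close to the 68.39 per cent quoted for the formula, while the Sample B nondelinquent result falls between the two-tailed .05 and .01 critical values (1.96 and 2.58), in line with the single asterisk it carries in Table 4.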


Results of cross-validation procedures (see
Table 6) indicate that classification by the
LDF permitted the correct identification of
65.9 per cent of the delinquents in Sample A’
and of 69.2 per cent in Sample B’. Corresponding
figures for nondelinquents are 79.1
per cent and 69.2 per cent for Samples A’ and
B’ respectively. In each case improvement
over chance is significant at the .01 level.

Although the two studies are not strictly
comparable, the addition of school record data
raised the overall predictive efficiency of the
LDF from 65.9 per cent in the main study to
70.9 per cent in the school study, a difference
significant at the .05 level.

Classification results in terms of the level
of delinquency are presented by means of
Table 7. Again, an increase in the proportion
of correct classifications corresponds closely
with increase in level of delinquency, a
marked drop in predictive efficiency at Level
7 being the only exception in the general
trend.


Table 7

Classification of Delinquents by Level of Delinquency Using LDF Procedures for Two Samples of the School
Study: Sample A’ on Which Prediction Formulas Were Developed and Sample B’
on Which the Formulas Were Applied Through Cross Validation

                       Sample A’            Sample B’             Total
                       (N = 91)             (N = 91)            (N = 182)

Level       N          N       %            N       %           N       %

3          106         31     59.62         38     71.70        69     65.71
4           30         12     75.00          9     60.00        21     67.74
5           21          6     66.67          9     75.00        15     71.43
6           13          5     83.33          5     71.43        10     76.97
7            7          3     60.00          1     50.00         4     57.14
8            5          3    100.00          1     50.00         4     80.00

Total      182         60     65.98         63     69.23       123     67.58


Discussion

Unlike a number of empirical methods
which have been suggested for the comparison
of multiphasic profiles, the evidence presented
here suggests that multivariate statistical
techniques provide a rigid measure of the distance
between two groups in terms of all the
available numerical information. Unlike the
measures achieved as a result of the application
of univariate procedures to separate scales
individually, the generalized distance function
provides a single unit of measurement which
is communicable, and lends itself particularly
to the purpose of making comparisons from
one study to another. Unlike univariate procedures,
multivariate techniques may be used for
the purposes of classification and prediction.

The relatively high percentage of correct
classifications achieved in this study, in spite
of the well-known high degree of intercorrelation
of MMPI variables² and the numerically
small average group differences on individual
variables, may serve as an indication of the
even greater usefulness of these techniques in
research studies where these limiting factors
do not exist to the extent they were found in
the present investigation.


2 Thus the median product-moment correlation coefficient
for the five variables of Sample A in the
main study was .30, with a range of .06 to .49; corresponding
figures for the six variables of Sample A’
of the school study were: median .04; range -.28
to .36.


Summary

This study has demonstrated the usefulness
of applying multivariate statistical techniques
to the analysis of MMPI scale scores alone,
or in combination with school data, for the
purpose of classifying ninth-grade boys as
potential delinquents or nondelinquents. The
techniques employed proved to be effective to
the extent that 62.3 per cent of the nondelinquents
and 69.5 per cent of the delinquents
were correctly identified by the use of multiphasic
data alone; while a combination of
multiphasic and school record data made possible
the correct identification of 74.2 per cent
of nondelinquent boys and 67.5 per cent of
delinquent-prone boys.
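
The gain from adding school record data (65.9 per cent overall in the main study against 70.9 per cent in the school study) can be examined with a pooled two-proportion z test. This is a sketch under that assumption only; the paper does not say which test produced its .05 level, and `two_proportion_z` is a name invented here. The counts are the numbers of correct classifications listed for each subsample in Tables 4 and 6.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled z statistic for the difference between two independent proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_b - p_a) / standard_error


# Correct classifications summed over the four subsamples of each study.
main_hits = 120 + 115 + 124 + 103      # main study, 701 profiles
school_hits = 60 + 72 + 63 + 63        # school study, 364 profiles

main_pct = 100.0 * main_hits / 701
school_pct = 100.0 * school_hits / 364
z = two_proportion_z(main_hits, 701, school_hits, 364)
```

With these counts z comes out at about 1.64, which sits at the one-tailed .05 critical value and below the two-tailed value of 1.96, so the reported level is reproduced only under a directional test.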


Received March 22, 1957. 


References 


1. Anderson, T. W. Classification by multivariate
   analysis. Psychometrika, 1951, 16, 31-50.

2. Christensen, C. M. Multivariate statistical analysis
   of differences between pre-professional
   groups of college students. J. exp. Educ., 1953,
   21, 221-232.

3. Collier, R. O. Some application of the method of
   pivotal condensation in statistical analysis. J.
   exp. Educ., 1953, 21, 233-241.

4. Hathaway, S. R., & Monachesi, E. D. Analyzing
   and predicting juvenile delinquency with the
   MMPI. Minneapolis: Univer. of Minnesota
   Press, 1953.

5. Johnson, M. C. Classification by multivariate
   analysis with objectives of minimizing risk,
   minimizing maximum risk, and minimizing
   probability of misclassification. J. exp. Educ.,
   1955, 23, 259-264.

6. Lockman, R. A. An evaluation of naval aviation
   cadet selection measures using multivariate
   discriminatory statistical techniques. Unpublished
   doctoral dissertation, Univer. of Minnesota,
   1954.

7. Lubin, A. Linear and nonlinear discriminating
   functions. British J. Psychol. statist. Sec.,
   1950, 3, 90-103.

8. Mahalanobis, P. C., Majumdar, D. N., & Rao,
   C. R. Anthropometric survey of the United
   Provinces, 1941: a statistical study. Sankhya,
   1949, 9, 89-324.

9. Rao, C. R. The problem of classification and
   distance between two populations. Nature, 1947,
   160, 835-836.

10. Rao, C. R. Tests of significance in multivariate
    analysis. Biometrika, 1948, 35, 58-79.

11. Rao, C. R. Advanced statistical methods in biometric
    research. New York: Wiley, 1952.

12. Roessel, F. P. Minnesota Multiphasic Personality
    Inventory results for high school drop-outs
    and graduates. Unpublished doctoral dissertation,
    Univer. of Minnesota, 1954.


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


Progressive Matrices (1938) and
Emotional Disturbance¹


Sidney Kasper 
Evansville State Hospital 


The Progressive Matrices (PM), unlike 
other indices of intelligence, has to date re- 
ceived little attention with essentially psy- 
chotic populations. An instrument intended as 
a measure of “clarity of thinking” and “rea- 
soning ability” might legitimately be expected, 
in the presence of illness, to reveal certain 
functional differences. Sensitivities of this 
type are thought evincible in the PM when 
contrasted with a verbal test relatively re- 
sistant to the effects of disturbance, such as 
vocabulary. This paper summarizes: (a) PM 
and the Wechsler Adult Intelligence Scale 
Vocabulary test (WAIS V) relationships to 
morbidity scores derived from the Lorr Multidimensional
Rating Scale for Psychiatric Patients
(MSRPP) and (b) the relationship to
morbidity of two points or more deviation be- 
tween expected and actual scores in any of 
PM’s five sets of 12 items each. These devia- 
tions have been thought to possess “psycho- 
logical significance.” 

Subjects included 50 consecutive first ad- 
missions to Evansville State Hospital who had 
not yet attained their 56th year, who showed 
no apparent organic involvements, and were 
capable of complying with instructions.² WAIS
V scores were prorated and then grouped so
as to fall within one of three categories: (a)
below 90 IQ, (b) 90-110 IQ, and (c) above
110 IQ. PM percentile scores for the same 
subjects were converted and grouped similarly. 


1 An extended report of this study may be obtained
without charge from Sidney Kasper, Evansville
State Hospital, Evansville, Indiana, or for a fee
from the American Documentation Institute. Order
Document No. 5431, remitting $1.25 for microfilm
or $1.25 for photocopies.

2 The author is especially indebted to Don R.
Brown for the statistical analysis of the data.


PM and WAIS V are in 48% agreement. 
Their differences do not permit rejection of 
the null hypothesis and may be attributable 
to chance. Contrary to prediction, 82% (N
= 41) of the cases may be described as scoring
ing as high or higher on the PM than on 
WAIS V, with PM scores being greater in 
34% (N = 17) of the total cases. These find- 
ings are not attributable to differences in age 
or educational attainment and are at variance 
with the expected trend. The .50 correlation 
between PM and WAIS V (p < .05) is essen- 
tially similar to those of previous investiga- 
tions using similar measures with less severely 
disturbed samples or with normals. 

No meaningful relationship is found be- 
tween ratings of morbidity and estimates of 
intellectual functioning. Differences in mean 
morbidity scores for the various discrepancies 
between intelligence measures do not clearly 
demonstrate a concomitant variation between 
events. Similarly, with respect to the specific 
MSRPP scales, “conceptual disorganization” 
and “perceptual distortion,” nonsignificant 
changes in these morbidity scores are asso- 
ciated with differences between intellectual 
measures. The results, then, consistently fail 
to demonstrate PM’s efficiency in discriminat- 
ing the influence of pathology. 

Finally, none of the disparities among PM 
sets of scores is meaningfully related to mor- 
bidity score. Therefore, clinical use of this 
feature of the test would seem severely limited. 

The results, in general, point to consider- 
able ambiguity and lack of validity in inter- 
pretations of PM scores with heterogeneous 
clinical populations. 


Brief Report. 
Received October 14, 1957. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


The Polarity of Psychological Tests 


Marshall B. Jones¹
US Naval School of Aviation Medicine, Pensacola, Florida 


A psychological test may be named after 
either of its extremes. The same test, for ex- 
ample, might be named either “Rigidity” or 
“Flexibility,” depending upon which end of 
the continuum is to be emphasized. Which of 
the two names we use is commonly considered 
a matter of convention. Implicit in this con- 
vention is the assumption that both ends of 
the continuum are equally meaningful. Con- 
ceptually, however, the dimension we intend 
to measure may be very far from meeting this 
assumption. “Anxiety” is a good example. We 
recognize immediately that the idea of anxiety 
is far clearer than is the idea of nonanxiety. 
The anxious end of the conceptual continuum 
is more meaningful or, if you like, has a more 
consistent meaning than has the nonanxious 
end. If this difference between the two ends 
of the conceptual continuum is reflected in 
the test itself, the homogeneity of items fall- 
ing at one end of the continuum will be 
greater than the homogeneity of items falling 
at the other end. Put more specifically, the 
average intercorrelation among difficult² items
will be greater (or less) than the average 
intercorrelation among easy items. To the ex- 
tent that there is a relationship between item 
difficulty and average interitem correlation we 
will speak of the test as polarized. The pur- 
pose of this paper is briefly to consider the 
scope and consequences of test polarity. 


1 Opinions or conclusions contained in this paper
are those of the author. They are not to be con- 
strued as necessarily reflecting the view or the en- 
dorsement of the Navy Department. 

2 Throughout this paper we shall use the termi- 
nology of aptitude testing in referring to the pro- 
portion of persons responding with the key to a 
given item. The terminology of “difficult” and “easy” 
items sounds strange in the context of personality 
testing, but there does not seem to be any equally 
well-understood alternative. 


An Illustration 


A discussion of polarity is perhaps best in- 
troduced by an illustration. The Pensacola Z 
Survey (4) contains five scales. Four of these 
correspond to familiar clinical ideas: Anxiety, 
Dependency, Rigidity, and Hostility.³ Each of
the four scales contains forty items in forced-choice
form, with no item overlap between
scales. All four scales are substantially homo- 
geneous (Hoyt coefficients ranging from .82 to 
.90) and consist of items which most psychol- 
ogists would recognize as related to the trait 
the scale is intended to measure. All four 
scales are scored in the neurotic direction. 
Thus, a high score indicates Anxiety, De- 
pendency, Rigidity, or Hostility. The distri- 
butions of the four scales divide them into 
two pairs: Anxiety and Hostility, which are 
skewed positively, and Dependency and Rig- 
idity, which are skewed negatively. To illus- 
trate polarity the items of each scale were 
broken down into four groups according to 
difficulty level. Because of the differences in 
skew between the Anxiety-Hostility pair and 
the Dependency-Rigidity pair the same cate- 
gories would not do for both pairs. To 16 of 
the Anxiety items, for example, and to 12 of 
the Hostility items, less than 30% of the sub- 
jects responded in the neurotic direction, while 
to none of the Dependency and to only three 
of the Rigidity items did less than 30% re- 
spond in the neurotic direction. For Anxiety 
and Hostility, therefore, the items were di- 
vided into items to which less than 30% of 
the subjects responded in the neurotic direc- 
tion, between 30% and 40%, between 40% 
and 50%, and over 50%. For Dependency 

3 The Z survey contains a fifth scale, called Heter- 


onomy, which is not included here as its items over- 
lap with all four of the other scales. 


26 Marshall B. Jones 


Table i 


Average Tetrachoric Correlations Between Z-Survey 
Items, Grouped by Scale and Difficulty 
Level, Together with the Num- 
bers of Items Involved 


Scale ps4 A<pss 
Anxiety .314(11) .235(9) _.195 (4) 
Hostility 221 (12) .196(8) 145(8) .14¢ (12) 
 S<ps6 6<ps.7 
Dependency .173(6) .173(S) .208 (12) .222 (17) 
Rigidity 244 (11) .242(8) 23511) 383 (10) 


and Rigidity, the items were grouped a¢cord- 
ing as the proportion of subjects respending 
in the neurotic direction was less than: 50%, 
between 50% and 60%, between 609 and 
70%, and over 70%. Tetrachoric correlations 
were available between all items of: each 
scale. The average correlation within each 
group of items was then obtained. In Table 
1 these averages are presented together with 
the number of items in the group. A glance 
at Table 1 shows that polarity is present.* In 
Anxiety, for example, the smaller the propor- 
tion of people who respond in the anxious 
direction the higher the homogeneity. To a 
lesser degree the same trend obtains for Hos- 
tility. The neurotic ends of these two scales, 
Anxiety and Hostility, are most consistent. 
Precisely the reverse obtains with respect to 
Dependency and Rigidity. The larger the pro- 
portion of persons responding in the depend- 
ent direction, the higher the homogeneity of 
the items. For these two scales, independence 
and flexibility are the more consistent ends 


* As has already been mentioned, Anxiety and Hos- 
tility are skewed positively, ie., toward anxiety anc 
hostility, while Dependency and Rigidity are skewed 
negatively, ie., toward independence and flexibility. 
In all four instances, therefore, the direction of skew 
lies toward the more polarized end of the scale. This 
relationship is not accidental. There is an algebraic 
relationship between polarity and skewing. However, 
other variables, eg., average item difficulty, affect 
skewing as well as polarity. A variable might well be 
skewed and entirely unpolarized at the same time. 
However, to the extent that a test is polarized it will 
tend also to be skewed. Thus, though the two phe- 
nomena cannot be equated, the existence of skew 
constitutes good reason to look for polarity. 


of the scale, though for Dependency the re- 
lationship is not strong. Polarity, then, defi- 
nitely does obtain, at least in the scales of the 
Z survey. However, the nature of the Z survey 
and its construction lead us to believe that 
very much the same results would be obtained 
with other personality questionnaires. 
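
The tabulation behind Table 1 (group the items by difficulty, then average the inter-item correlations inside each group) can be sketched as follows. Two points are assumptions of this sketch rather than the author's procedure: the phi coefficient (a product-moment correlation on 0/1 scores) is swapped in for the tetrachoric correlation, and the eight-subject response matrix is invented toy data.

```python
from itertools import combinations


def phi(x, y):
    """Product-moment correlation between two 0/1 item vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def polarity_profile(responses, cuts):
    """Average inter-item phi within difficulty bands.

    responses: one row of 0/1 scores per subject.
    cuts: ascending upper bounds on p, the proportion giving the keyed
          response; items above the last cut are filed under None.
    """
    bands = {}
    for item in zip(*responses):
        p = sum(item) / len(item)
        band = next((c for c in cuts if p <= c), None)
        bands.setdefault(band, []).append(item)
    profile = {}
    for band, members in bands.items():
        pairs = list(combinations(members, 2))
        if pairs:  # a band needs at least two items to have a correlation
            profile[band] = sum(phi(a, b) for a, b in pairs) / len(pairs)
    return profile


# Toy data: two rarely keyed items that agree perfectly and two
# frequently keyed items that do not.
responses = [
    [1, 1, 1, 0], [1, 1, 1, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1],
    [0, 0, 0, 1], [0, 0, 0, 1],
]
profile = polarity_profile(responses, cuts=[0.5])
```

In this toy case the low-p band is far more homogeneous than the high-p band, which is the pattern the text describes for Anxiety and Hostility; a test with no polarity would show roughly equal averages across bands.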


Homogeneity 


To the extent that a test is polarized we 
may expect the homogeneity of the test to be 
greater (or less) in a high-scoring sample of 
subjects than in a low-scoring sample. This 
proposition asserts the principal pragmatic 
consequence of test polarity. To establish its 
truth let the total score $T_i$ for subject i be
divided into two components $X_i$ and $Y_i$, where
$X_i$ is the subject’s score on the easier half of
the items and $Y_i$ is his score on the harder
half. In addition, consider two samples of sub- 
jects: a high-scoring sample and a low-scoring 
sample. Most of the subjects in the high- 
scoring sample will answer most of the easy 
items correctly. Therefore, for the high-scoring
group the standard deviation, $S_X$, will be
small in comparison to $S_Y$. In the low-scoring
sample, however, most of the subjects will
answer most of the hard items incorrectly.
Therefore, for them $S_Y$ will be small in comparison
to $S_X$. Our proposition asserts that if
the test is polarized the “homogeneity” of the 
test in the high-scoring sample will be differ- 
ent from its homogeneity in the low-scoring 
sample. To come to grips with this proposi- 
tion it is necessary to be more specific about 
homogeneity, which means, essentially, what 
Cronbach (1) has called equivalence, “the de- 
gree to which the test score indicates the sta- 
tus of the individual at the present instant in 
the general and group factors defined by the
test.” There is a fairly wide choice of co- 
efficients: any one of the Guttman lower 
bounds (2), including the Kuder-Richardson 
(7), an analysis of variance coefficient (3), or 
Loevinger’s homogeneity coefficient (8). The 
proposition at issue could be established with 
any one of these. However, it is most easily
established with the Kuder-Richardson for- 
mula which is at the same time the most fa- 
miliar of the available coefficients. 
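
For readers who want the coefficient in computable form, the following is a minimal sketch of the Kuder-Richardson coefficient, assuming formula 20 is the one meant and that items are scored 0 or 1. The function name `kr20` and the choice of divisor n for the variance are assumptions made for illustration.

```python
def kr20(scores):
    """Kuder-Richardson formula 20 for a list of subjects' 0/1 item vectors."""
    n_subj = len(scores)
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n_subj
    # Variance of total scores, using divisor n (an assumption of this sketch).
    var_t = sum((t - mean_t) ** 2 for t in totals) / n_subj
    if var_t == 0:
        raise ValueError("total scores do not vary")
    # Sum of item variances; for a 0/1 item the variance is p * (1 - p).
    sum_pq = 0.0
    for g in range(n_items):
        p = sum(row[g] for row in scores) / n_subj
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_t)
```

Computing `kr20` separately on a high-scoring and a low-scoring subsample is the comparison the proposition concerns.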

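In the low-scoring sample let $S_X = K S_Y$,
where $K > 1$. For the sake of simplicity we
will assume that this relationship is exactly
reversed in the high-scoring sample, i.e., that
$S_Y = K S_X$. In both samples the Kuder-
Richardson coefficient is given by

\[
r_{TT} = \frac{2N}{2N-1}\left(1 - \frac{\sum_g S_g^2}{S_T^2}\right)
\]

where 2N is the number of items in the test
(both halves) and where g ranges over all
2N items. By the algebraic manipulation of
the formula for $r_{TT}$ we find that

\[
r_{TT} = \frac{2N}{2N-1} \cdot \frac{\frac{N-1}{N} K^2 r_{XX} + \frac{N-1}{N} r_{YY} + 2K r_{XY}}{1 + K^2 + 2K r_{XY}}
\]

in the low-scoring sample and that

\[
r_{TT} = \frac{2N}{2N-1} \cdot \frac{\frac{N-1}{N} r_{XX} + \frac{N-1}{N} K^2 r_{YY} + 2K r_{XY}}{1 + K^2 + 2K r_{XY}}
\]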


in the high-scoring sample, where $r_{XX}$ and $r_{YY}$
are the Kuder-Richardson coefficients for X
and Y respectively and where $r_{XY}$ is the product-moment
coefficient between X and Y.⁵ As
is clear from these two equations, if $r_{XX}$ and
$r_{YY}$ are equal the homogeneity coefficients in
the two samples are identical. In this context,
equality of $r_{XX}$ and $r_{YY}$ is tantamount to no
polarity. Therefore, if there is no polarity, the
two homogeneity coefficients are equal. On the
other hand, if $r_{XX} \neq r_{YY}$, polarity exists. In
terms of the present model degree of polarity
is given by the difference, $r_{XX} - r_{YY}$. To the
extent that $r_{XX}$ is greater than $r_{YY}$, the homogeneity
of the test in the low-scoring sample
will be greater than its homogeneity in the 
high-scoring sample; and conversely. If a test 
is polarized toward anxiety, for example, the 
difference between two anxious subjects will 


5 In these formulas $r_{XX}$ is assumed to have the
same value in the low-scoring sample as in the high-
scoring sample; in a similar way, $r_{YY}$ is supposed to
remain constant in the two samples. These assumptions
are tantamount to supposing that X and Y are
not themselves polarized. However, if the entire test
(X + Y) is polarized, its parts, X and Y, will very
probably be polarized as well. By assuming that X 
and Y are not polarized we minimize the effects of 
polarity on homogeneity. Polarization within the two 
halves as well as between them would simply rein- 
force the effects on homogeneity of the difference in 
polarity between the hard and the easy halves of the 
test. 


be more reliable than a difference of the same 
magnitude between two nonanxious subjects. 
The implications for tests which are used or 
intended to be used in both clinical and nor- 
mal populations are clear. If the test is polar- 
ized, its internal properties cannot remain the 
same in both populations. If track were kept 
of polarity, we might be able to account for 
some of the instances in which a clinically 
developed instrument “falls apart” in a nor- 
mal population and vice versa. 
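
The dependence of homogeneity on the sample can be checked numerically. The sketch below is an illustration under stated assumptions rather than a transcription of the printed formulas: it substitutes the variance of a sum, $S_T^2 = S_X^2 + S_Y^2 + 2 r_{XY} S_X S_Y$, into the Kuder-Richardson formula, takes $S_X = K S_Y$ in the low-scoring sample and $S_Y = K S_X$ in the high-scoring sample, and holds $r_{XX}$, $r_{YY}$, and $r_{XY}$ constant across samples. The function name is hypothetical.

```python
def split_sample_kr(k, r_xx, r_yy, r_xy, n_half):
    """Return (low, high) Kuder-Richardson coefficients for the whole test.

    k       ratio of the larger to the smaller half-test standard deviation
    r_xx    Kuder-Richardson coefficient of the easier half X
    r_yy    Kuder-Richardson coefficient of the harder half Y
    r_xy    product-moment correlation between X and Y
    n_half  number of items in each half (N in the text)
    """
    w = (n_half - 1) / n_half
    lead = 2 * n_half / (2 * n_half - 1)
    denom = 1 + k ** 2 + 2 * k * r_xy
    # Low-scoring sample: X carries the larger variance, so r_xx is weighted by k**2.
    low = lead * (w * (k ** 2 * r_xx + r_yy) + 2 * k * r_xy) / denom
    # High-scoring sample: Y carries the larger variance, so r_yy is weighted by k**2.
    high = lead * (w * (r_xx + k ** 2 * r_yy) + 2 * k * r_xy) / denom
    return low, high
```

With equal half-test coefficients the two values coincide; when `r_xx` exceeds `r_yy` the low-scoring sample shows the higher coefficient, as the argument in the text requires.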


Significance 


In order to deal effectively with polarity, 
one must be able to say whether the polarity 
of a given test is statistically significant or 
not. The means about to be described for 
meeting this requirement are manifestly im- 
perfect. However, the problem at issue is not 
simple; there may be no practical yet logically 
rigorous and conceptually adequate way of 
assessing the significance of polarity. 

Table 1 expresses the notion of polarity in 
terms of the average tetrachoric correlation 
within each of four groups of items, the items 
being arranged in order of difficulty. To ob- 
tain a coefficient from these data one needs 
only to calculate a rank correlation coefficient 
(6, 9) between the average tetrachoric⁶ correlation
within a group and the difficulty level
of the group. However, with only four groups 
much in the way of significance is not likely 
to be established. The computational load, 
moreover, is considerable. In Table 2 the re- 
sults of breaking the scales of the Z survey 
into groups of three items each are set forth. 
There are 13 groups for each scale, begin- 
ning with the three most difficult items and 
ending with the three least difficult. In all 


6 Throughout this paper tetrachoric coefficients are 
used. Our use of the tetrachoric in preference to the 
phi coefficient needs some justification. The sampling 
distribution of the phi coefficient is not known. How- 
ever, whatever the distribution may be, its expected 
value must be dependent upon item difficulty. If we 
sample from a population with a known value of phi, 
the expected value of phi in the sample will depend 
upon the item difficulties of the items involved. The 
idea of polarity, however, turns entirely on the rela- 
tionship between item difficulty and inter-item corre- 
lation. In consequence, the use of the phi coefficient 
would completely confound polarity with statistical 
artifact. On the other hand, the expected value of the 
tetrachoric coefficient is independent of item difficulty. 




Marshall B. Jones 


Table 2

Average Tetrachoric Correlations Within Groups of Three Items Arranged in Order of Difficulty, Together with the Corresponding Polarity Coefficients, for the Four Z-Survey Scales

(Columns: Scale; Item Group 1 through 13. The cell values are not legible in the source.)


four scales the item which ranked 19th was 
dropped. As is clear from the table, the results 
yield very much the same picture as did the 
results set forth in Table 1. However, the 
number of tetrachorics which had to be calcu- 
lated for each scale has been reduced from 
around 200 to 39. At the same time the n
available for a test of significance has increased
from four to thirteen. However, even
with groups of three items each the n for the
test of significance is small. In Table 2 the
rank correlations (tau) for the four scales appear.
Of these, only the coefficient for Anxiety
is significant. Fortunately, the n can be considerably
increased. It is not necessary to
treat the entire, undivided sample. The total 
sample may be divided into subsamples, and 
a coefficient calculated which the writer has 
called “a partial rank correlation coefficient” 
(5) and which Torgerson has called “a rank 
correlation coefficient within subgroups” (10). 
In general, the value of the partial coefficient 
will approximate the value obtained with the 
whole sample. However, the standard error of 
the partial coefficient will equal exactly $k^{-1/2}$
times the standard error of the original coefficient,
where k is the number of subsamples.
The use of subsamples has the effect of simply
increasing the n upon which the test of significance
is based. The number of subsamples
into which the total may be broken is limited 
primarily by the nature of the tetrachoric co- 
efficient. So many subsamples may not be used 
that one or more of the cells involved in calculating
the tetrachorics become empty, since
in that circumstance the tetrachoric would not 
be defined. In practice, the actual subsam- 
pling is not needed. The data of Tables 1 and 
2, for example, are based on 310 subjects and 


to no item does a majority greater than .85 
respond in the same way. Therefore, the sam- 
ple may be separated into five subsamples of 
62 subjects each with some confidence that 
the tetrachorics will retain their meaning. The 
standard error for the original taus was .21; 
therefore, the standard error for the partial 
taus will be .115. Since the values of the par- 
tial tau will approximate the values based on 
the entire sample, Dependency is definitely 
polarized. Hostility and Rigidity, however, 
would seem not on the evidence presented to 
have been established as polarized. For most 
purposes this general method, though far from 
elegant, seems to give an adequate evaluation 
of polarity. 
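
The significance procedure can be sketched in a few lines. This is an illustration under assumptions, not the author's worksheet: tau is computed as tau-a with ties counted as zero, the standard error is the usual large-sample value for tau under independence with no ties, and the subsample rule multiplies that standard error by k to the power -1/2, an exponent that is only partly legible in the source. All function names are hypothetical.

```python
import math

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    return s / (n * (n - 1) / 2)

def tau_null_se(n):
    """Standard error of tau under independence, n untied observations."""
    return math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))

def subsample_se(se, k):
    """Standard error after pooling k subsamples (assumed k ** -0.5 rule)."""
    return se * k ** -0.5
```

A polarity coefficient for one scale would be `kendall_tau(difficulty_ranks, group_means)`, where both arguments are hypothetical lists with one entry per item group. With 13 groups `tau_null_se(13)` is about .21, in agreement with the figure quoted in the text. For k = 5 the assumed rule gives a value somewhat below the .115 printed above, so that figure may rest on a step not shown.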


Summary 


Many psychological traits are polarized in 
the sense that one end of the conceptual con- 
tinuum is clearer, i.e., has a more consistent 
meaning, than the other. In the first section 
of this paper, the four peripheral scales of the 
Pensacola Z Survey were all found to involve 
some degree of polarity. Polarization of the 
scale, Anxiety, was particularly powerful. The 
construction and content of the Z survey is, 
moreover, quite typical of personality inven- 
tories in general. It is, therefore, reasonable to 
suppose that many, if not most, personality 
scales will be found to reflect the polarity of 
the conceptual continuum they are supposed 
to measure. The primary significance of polar- 
ity is that to the extent that a test is polarized 
the homogeneity of the test in high-scoring 
samples will be greater (or less) than its 
homogeneity in low-scoring samples. In the 
second section of this paper this result was 
established. In the third and last section, a 




procedure was proposed by which the degree 
of polarity a test exhibits can be tested for 
significance. The procedure involves the cal- 
culation of Kendall’s tau between the average 
tetrachoric correlation within groups of items 
and the difficulty of the item group in which 
the tetrachoric average obtains. The tau is 
then tested for significance. 


Received March 21, 1957. 


References 


1. Cronbach, L. J. Test “reliability”: Its meaning
and determination. Psychometrika, 1947, 12,
1-16.

2. Guttman, L. A basis for analyzing test-retest
reliability. Psychometrika, 1945, 10, 255-282.

3. Hoyt, C. Test reliability obtained by analysis of
variance. Psychometrika, 1941, 6, 153-160.

4. Jones, M. B. The Pensacola Z Survey: A study
in the measurement of authoritarian tendency.
Psychol. Monogr., in press.

5. Jones, M. B. An addition to Schaeffer and
Levitt’s “Kendall’s Tau.” Psychol. Bull., 1957,
54, 159-160.

6. Kendall, M. G. Rank correlation methods.
London: Griffin, 1948.

7. Kuder, G. F., & Richardson, M. W. The theory
of the estimation of test reliability.
Psychometrika, 1937, 2, 135-138.

8. Loevinger, Jane. A systematic approach to the
construction and evaluation of tests of ability.
Psychol. Monogr., 1947, 61, No. 4 (Whole No.
285).

9. Schaeffer, M. S., & Levitt, E. E. Concerning
Kendall’s Tau, a non-parametric correlation
coefficient. Psychol. Bull., 1956, 53, 338-346.

10. Torgerson, W. S. A non-parametric test of
correlation using rank orders within subgroups.
Psychometrika, 1956, 21, 145-152.


Journal of Consulting Psychology
Vol. 22, No. 1, 1958


An Experimental Group Version for School Children 
of the Progressive Matrices¹


Read D. Tuddenham, Louis Davis, Leslie Davison, 
and Richard Schindler 


An experimental group test version of Rav- 
en’s Progressive Matrices was prepared by 
the “ditto” reproduction process, to explore 
the suitability of the test for American grade- 
school children. Features of the experimental 
version were elimination of the separate an- 
swer sheets and of the color printing used in 
Raven’s 1947 edition. Two classrooms, se- 
lected to represent different socioeconomic lev- 
els, were tested at each grade level from third 
through sixth. 

Mean scores showed the expected progres- 
sion from grade to grade. Reliability coeffi- 
cients for separate grade levels range from .87 
to .94. 

Tentative percentile norms were presented 
for each grade level. Despite the relative 
crudity of our dittoed version and the small 
samples tested, the norms are very close to 
those presented by Raven for Scottish chil- 
dren. Our data do not show the consistent 
superiority of American children reported by 
Green and Ewert (1). 

Percentages passing per item are reported 
separately by grade. While the actual diffi- 
culty values differ from those of Green and 
Ewert, the order of items by difficulty corre- 
sponds very closely to their findings. It shows 
marked consistency from grade to grade within 
our own data but differs appreciably from the 
published sequence. 

Differences between boys and girls in mean 
scores and in standard deviations for separate 
grade levels are inconsistent and very small 


1 An extended report of this study may be obtained 
without charge from R. D. Tuddenham, Dept. of 
Psychology, University of California, Berkeley 4, 
Calif., or for a fee from the American Documenta- 
tion Institute. Order Document No. 5429, remitting 
$1.75 for microfilm or $2.50 for photocopies. 


University of California 


in magnitude. Sex differences appear not to 
be related to test performance. 

Differences between upper-middle-class and 
working-class schools are very substantial and 
highly significant. Despite the novel content 
of Progressive Matrices and its freedom from 
dependence on language, the test reveals 
marked differences between children from dif- 
ferent socioeconomic levels. 

Correlations between Progressive Matrices 
and Kuhlmann-Anderson, California Mental 
Maturity, and Lorge-Thorndike tests are simi- 
lar in magnitude, clustering around .40 or .45, 
despite an interval of as long as two years between
the administration of the two tests correlated.
Green and Ewert report similar values.

In conclusion, the Progressive Matrices test
seems to be well adapted to group testing in
elementary schools. It is brief, reliable, and 
capable of engaging the interest and atten- 
tion of school children of the grade levels in- 
vestigated. Our results indicate that a black 
and white version is satisfactory for children 
as young as third graders, and marking an- 
swers on the booklets instead of on separate 
answer sheets avoids a possibly more serious 
source of confusion than any which might re- 
sult from the absence of color. In view of the 
great potential usefulness of a nonlanguage 
test at once so brief, reliable, and broad in 
range of difficulty, it is to be hoped that Pro- 
gressive Matrices will soon be made available 
in an inexpensive “throw away” booklet form. 
Brief Report. 

Received October 21, 1957. 


Reference 


1. Green, M. W., & Ewert, Josephine C. Normative 
data on Progressive Matrices (1947). J. con- 
sult. Psychol., 1955, 19, 139-142. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958


Test Anxiety Level and Goal-Setting Behavior 


E. Philip Trapp and Donald H. Kausler 


University of Arkansas 


In recent years, the effects of anxiety as a 
drive state have been studied on a diversity 
of activities, ranging from simple motor tasks 
to complex verbal problems (e.g., 1, 3, 4, 7, 
9). These studies have been concerned pri- 
marily with the facilitating or inhibiting ef- 
fects of anxiety on performance level. A re- 
lated problem, which has not been specifically 
investigated, is the effect of test anxiety on 
goal-setting behavior. 

This study is primarily concerned with an 
aspect of the relationship between anxiety and 
aspiration level. More exactly, the study in- 
volves a comparison of level of aspiration 
scores (LOA scores) between Ss with high 
test anxiety and low test anxiety over several 
trials on a learning task. The problem may be 


phrased in terms of the null hypothesis, i.e., 
there is no difference between the LOA scores 
of a high test anxiety group and a low test 
anxiety group. 


Method 


Eighty-four students, approximately the 
same number of each sex in two sections of an 
undergraduate course in abnormal psychology 
at the University of Arkansas, were given the 
test anxiety questionnaire developed by Sara- 
son and Mandler (2, 5, 6). The questionnaire 
was given during regular class periods. The 
Ss were told that the purpose was to measure 
the attitude of students at the university to- 
ward taking various types of tests. The in- 
structions, administration, and scoring of the 
questionnaire followed the procedure outlined 
by Sarason and Mandler. The 28 Ss scoring 
in the upper third for this distribution of 
scores were arbitrarily designated as the high 
test anxiety group (Group HA); the 28 Ss 
scoring in the lower third were arbitrarily 
designated as the low test anxiety group 
(Group LA). 


One week later, the Ss were given an ex- 
tended group form of the Wechsler-Bellevue 
Digit Symbol Test (8). The general instruc- 
tions read as follows: 


The task you are about to do is the digit symbol 
subtest of the Wechsler-Bellevue Intelligence Scale. 
This subtest predicts very successfully total IQ when 
it is administered in an individual setting. We are 
interested in seeing if it can predict as well when
administered in a group setting. Therefore, it is of 
particular importance that you listen carefully to the 
instructions and follow them accurately. 

It has been found that people either overestimate
considerably or underestimate considerably their performance
on an ego-involved task such as an intelligence
test. Consequently, you will be asked to predict
how well you think you will do before taking
the subtest.

This study is being repeated in other universities 
throughout the country to see if there are sectional 
differences both in respect to performance on the 
digit symbol test and on ability to predict perform- 
ance. Therefore, make an all-out effort to perform at 
your best. 


The Ss were then presented with a test 
booklet. The front page asked for identifica- 
tion data, such as name, age, etc. Also, space 
was provided for S to record both his estimate 
of his future performance (level of aspiration) 
and his actual performance score (level of 
performance) on each of the four trials he 
received in the test period. The other four 
pages of the booklet were identical. Each page 
contained a copy of the Wechsler-Bellevue 
Digit Symbol Test, increased in length to 75
unfilled boxes. The standardized instructions
for the subtest were read to the Ss, and the 
sample set of problems completed. The Ss 
were then asked to turn back to the front page 
of the booklet and record the number of boxes 
they thought they could fill in a minute and 
a half. After each trial, the Ss counted their 
number of completions and recorded the total 
beneath their prediction score. They then 




Table 1
Means and Standard Deviations of Groups HA and LA for LOA and LOP Scores by Trials

                   Trial 1         Trial 2         Trial 3         Trial 4
Group   Score     M      SD       M      SD       M      SD       M      SD
HA      LOA     55.21   8.54    57.00   9.73    62.14   8.63    65.07   8.72
HA      LOP     53.79   9.35    59.61   9.44    64.32   9.40
LA      LOA     56.43   7.34    60.43   6.64    66.50   5.95    69.57   6.64
LA      LOP     56.14   9.45    63.25   7.74    66.64  10.16


made their estimate for the next trial. Time 
between trials was two minutes. Apparently, 
none of the Ss was aware of any connection 
between the test anxiety questionnaire and 
this testing session. 


Results 


The means and standard deviations of the 
level of aspiration scores (LOA) and the level 
of performance scores (LOP) for Group HA 
and Group LA on successive trials are given 
in Table 1. Since many Ss in both groups 
reached maximum performance on the fourth 
trial, the performance scores for this trial were 
highly skewed and, consequently, have been 
omitted from further analysis. 

The difference between mean LOP scores
for the two groups on each trial was tested
for significance by the t test. The obtained
values of t are given in Table 2. Inspection of
Table 2 shows that no significant differences
in mean LOP between the two groups occurred
on any of the trials. As an additional
measure of performance level, a composite
mean LOP score was determined for each
group, and the difference between group
means was tested by the t test. This score was
found by averaging the three performance
scores for each S and then determining the
mean of these averages. For Group HA, the
mean was 59.25 and standard deviation was
9.01. For Group LA, the mean was 62.01 and
standard deviation was 8.34. The obtained t
of 1.37 is not significant at the .05 level of
confidence.

Either by treating the trials separately or 
in combination, the results support the null 
hypothesis of no real difference in perform- 


ance ability between the two groups on the 
digit symbol test. These findings, incidentally, 
agree, in general, with the results obtained by 
Mandler and Sarason (2) on comparable 
groups also given the Wechsler-Bellevue Digit 
Symbol Test. 

The difference between mean LOA scores
for the two groups was tested for significance
on each trial by the t test. The resulting values
of t are given in Table 2. On the first trial
there was no statistically significant difference
in LOA between the two groups. This was the
only trial in which the Ss expressed their aspirations
without prior direct knowledge of
their performance level. Hence, at the start of
the testing session there was no real difference
in the goal-setting behavior of the two groups.

With knowledge of results commencing on 
the second trial, an immediate trend can be 
detected which reaches statistical significance 
in the third and fourth trials. The personal
frame of reference for evaluating aspiration 
level based on performance information ap- 
parently had a differential effect on the goal- 


Table 2

Values of t for Differences Between Group
Means on LOA and LOP Scores

                 Trial
Score     1      2      3       4
LOA      .56   1.51   2.16*   2.13*
LOP      .92   1.55    .87
* Significant at the .05 level.


setting behavior of the two groups. Group LA 
consistently displayed a higher level of aspi- 
ration than Group HA. 

Further evidence demonstrating the tend- 
ency of Group LA to set a higher level of as- 
piration following knowledge of results is re- 
vealed in a comparison of composite group 
means. To obtain these means, the aspiration 
scores of each S for the three trials (second, 
third, and fourth) were averaged and the 
mean of these averages determined for both 
groups. For Group HA, the mean was 61.40 
and the standard deviation was 8.29. For 
Group LA, the mean was 65.50 and the standard
deviation was 5.63. The obtained t of 2.12
between the means is significant beyond the
.05 level of confidence.
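
The composite comparison can be reproduced approximately from the summary figures alone. The sketch below is illustrative and rests on an assumption not stated in the article: that each reported standard deviation was computed with divisor n, so that the standard error of the difference between two groups of equal size n is the square root of the summed variances over n - 1. The function name is hypothetical.

```python
import math

def t_from_summary(m1, sd1, m2, sd2, n):
    """t for two independent groups of equal size n, from means and SDs.

    Assumes each SD used divisor n (not n - 1), so the standard error
    of the difference is sqrt((sd1**2 + sd2**2) / (n - 1)).
    """
    se_diff = math.sqrt((sd1 ** 2 + sd2 ** 2) / (n - 1))
    return (m2 - m1) / se_diff

# Composite LOA scores: Group HA (61.40, 8.29) against Group LA (65.50, 5.63).
t_loa = t_from_summary(61.40, 8.29, 65.50, 5.63, 28)
```

Under this assumption `t_loa` comes to roughly 2.13, close to the reported 2.12; the small gap is within what rounding of the published means and standard deviations would produce.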


Discussion 

The most striking finding in our study was 
the increasing divergence in goal-setting be- 
havior between the two groups on successive 
trials despite correspondingly undifferentiated 
levels of performance and a roughly equiva- 
lent pretest level of aspiration. The differen- 
tial LOA between the two groups reached sta- 
tistical significance on the third and fourth 
trials. This marked tendency of Group LA to 
express higher goal-setting behavior could lead 
to several plausible interpretations. The inter- 
pretation favored by the authors may be 
stated as follows. The high test anxious S, 
while objectively performing no poorer, is 
more condemning and critical of his test be- 
havior than the low test anxious S. In the 
evaluation of his test performance, he then 
tends to overemphasize the negative factors, 
is less optimistic over future performance, and 
expresses this judgment in a relatively lowered 
LOA. As he gains more information or knowl- 
edge of results, his frame of reference becomes 
increasingly slanted in the conservative direc- 
tion. The low test anxious S, on the other 
hand, is subjectively more influenced by his 
success or progress, and is thereby less re- 
strained in his goal-setting behavior. 

It should be noted that both groups func- 
tioned within the realistic range throughout 
the testing. The self-devaluating tendency of 
Group HA did not reach excessive proportions 
because the task was apparently not overly 
frustrating. A more difficult task or even the 




same task given under conditions of high 
stress would be predicted to bring on a pro- 
nounced drop in adaptive behavior and a con- 
comitant expression of an unrealistic level of 
aspiration in Ss with high test anxiety. The 
finding that our two groups did not differ sig- 
nificantly in level of performance is consistent 
with other reported studies in which anxiety 
was the independent variable in learning 
tasks. In a number of related studies, the 
Iowa group has shown that high anxious Ss 
perform relatively better on simple tasks and 
low anxious Ss better on complex tasks. The 
digit symbol test, being of medium difficulty, 
should mask any differences between the two 
groups. Both Mandler and Sarason (2) and 
our study confirmed this expectation. 


Summary 


Eighty-four college students were adminis- 
tered the test anxiety questionnaire developed 
by Mandler and Sarason. They were split into 
two extreme groups on the basis of their 
scores on the questionnaire. The first group 
(Group HA) consisted of Ss scoring in the top 
third of the distribution. The second group 
(Group LA) consisted of Ss scoring in the 
bottom third. Both groups were given four 
trials on an extended group form of the 
Wechsler digit symbol subtest. Prior to each
trial the Ss expressed their level of aspiration 
(LOA), and after each trial they were given 
knowledge of their own results. A comparison 
of mean LOA scores between the two groups 
was made on each trial, with t’s of .56, 1.51,
2.16, and 2.13, respectively. The last two
trials showed significant differences, with 
Group LA reflecting higher LOA scores. A 
comparison of LOA scores between the two 
groups was made by combining the data on 
the last three trials, with the obtained t of
2.12 being significant beyond the .05 level of
confidence. The mean performance scores between
the two groups were compared for each
trial, with t’s of .92, 1.55, .87, respectively.
None of the trials showed significant differences.
When the performance data on all of
the trials were combined for each group and
then compared, a t value of 1.37 was obtained,
which also was not statistically significant. 


Received March 15, 1957. 




References 


1. Farber, I. E., & Spence, K. W. Complex learning 
and conditioning as a function of anxiety. J. 
exp. Psychol., 1953, 45, 120-125. 

2. Mandler, G., & Sarason, S. B. A study of anxiety 
and learning. J. abnorm. soc. Psychol., 1952, 
47, 166-173. 

3. Matarazzo, J. D., Ulett, G. A., & Saslow, G.
Human maze performance as a function of
increasing levels of anxiety. J. gen. Psychol.,
1955, 53, 79-96.

4. Montague, E. K. The role of anxiety in serial rote 
learning. J. exp. Psychol., 1953, 45, 91-98. 

5. Sarason, S. B., & Gordon, E. M. The test anxiety 
questionnaire: scoring norms. J. abnorm. soc. 
Psychol., 1953, 48, 447-448. 


6. Sarason, S. B., & Mandler, G. Some correlates of 
test anxiety. J. abnorm. soc. Psychol., 1952, 
47, 810-817. 

7. Spence, K. W., Farber, I. E., & McFann, H. H. 
The relation of anxiety (drive) level to per- 
formance in competitional and noncompeti- 
tional paired associates learning. J. exp. Psy- 
chol., 1956, 52, 296-305. 

8. Wechsler, D. The measurement of adult intelli- 
gence. (3rd ed.) Baltimore: Williams & Wil- 
kins, 1944. 

9. Wesley, E. Perseverative behavior in a concept-
formation task as a function of manifest anxiety
and rigidity. J. abnorm. soc. Psychol.,
1953, 48, 129-134.


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958


Further Normative Data on the Progressive Matrices¹


Gerald Sperrazzo² and Walter L. Wilkins


Saint Louis University 


The normative data on the colored Pro- 
gressive Matrices (1) suggest that clinicians 
should be cautious in the application of 
Raven’s original norms, based on Dumfries, 
Scotland children (4), to American children. 
The data from the Rochester, Minnesota, chil- 
dren reported by Green and Ewert (1) were 
in some respects quite different from the Scot- 
tish data and led to the inference that there 
may have been a notable difference in the in- 
telligence level of children in the Scottish and 
Minnesota schools, or that there was a differ- 
ence in the socioeconomic levels of the pupils 
of the normative sample, or that unidentified
factors were producing the difference.
Prompted by the implication that there were 
differences in productivity on the Progressive 
Matrices in the two samples, we tried to 
cross validate the two sets of normative data 
with the testing of St. Louis school children. 
As an additional feature of the study, the se- 
lection of the sample was further arranged to 
allow racial background of the subjects to 
affect the results, if it should be a relevant 
factor. 


Procedure and Subjects 


Except for the gross size of the sample, 
replication of the Rochester sample for age 
and grade level was attempted. From the pub- 
lic schools of St. Louis a sample of 480 chil- 
dren was selected and arranged by grade and 
age as shown in Table 1. Sampling took ac- 
count of the fact that St. Louis, sometimes 
described as a border city, contains schools 
with various compositions of Negro and Cau- 
casian pupils. Approximately one-third of the 
children came from a school with an all white 


1 Supported in part by a grant from the Saint
Louis University Human Relations Center. 
2 Now at the University of Kentucky. 


pupil population, one-third from a school with 
an all Negro population, and one-third from a 
school with 60% Negro and 40% white popu- 
lation. As a partial control of economic fac- 
tors, the occupations of the pupils’ fathers 
were classified within three levels from the 
Dictionary of Occupational Titles here called: 
A, professional or semi-professional; B, skilled, 
semi-skilled, or clerical; and C, service, un- 
skilled, or labor. For each of these three levels, 
180 pupils were chosen. The nature of the 
sample, then, permitted the identification of 
the source of variance from four variables: 
age, sex, race, and socioeconomic status. 

The Progressive Matrices test was adminis- 
tered in both individual and group form. The 
book form of the test was used for individual 
administration with children between the ages 
of seven and eight. For the older children, 
2 x 2 color slides were made of each matrix 
of the test and projected upon a screen to 
groups ranging in size from 15 to 30. The 
group method of administration has been re- 
ported previously and found to be reliable 
with no significant differences between group 
and individual scores (1). 


Results and Discussion 


While the data upon Minnesota children 
suggested that the Scottish norms were low, 
the results of the 1956 St. Louis testing sug- 
gest that they may be closer to the American 
standard than suspected. Scores reported by 
Green and Ewert for Rochester children were 
consistently higher than Raven’s Dumfries 
scores by two to six points throughout the 
range reported. In St. Louis, however, this 
apparent American superiority is greatly lessened.
The scores reported in Table 1 show St. Louis
medians approximating Raven’s medians, with the
St. Louis children slightly superior at the younger
ages and possibly slightly inferior at the upper
age ranges.

Gerald Sperrazzo and Walter L. Wilkins

Table 1

Colored Progressive Matrices Scores for St. Louis Children as Compared with
Dumfries and Rochester Children

Age

7/0-7/5
7/6-7/11
8/0-8/5
8/6-8/11
9/0-9/5
9/6-9/11
10/0-10/5
10/6-10/11
11/0-11/5 70

Medians

St. Louis Dumfries Rochester

16 21
18 21
24
25
26
27
28
29
30

Inspection of Table 1 indicates that the 
median scores found through this study agree 
more consistently with Raven’s median scores 
than with those from the Rochester study. 
Although the range of difference between the 
median scores of Raven and those of this 
study is from zero to three points, the aver- 
age median score difference is one point. It 
appears, then, that the results found here sup- 
port the validity of Raven’s norms on the 
Dumfries school children for application to 
the St. Louis population. 

Accounting for all the variables which may 
have led to such a notable discrepancy be- 
tween the median scores of the Rochester 
study and the present investigation is rather 
difficult. The socioeconomic status of the 
sample used in Rochester may be relatively 
higher than it would be in Dumfries. If this 
were the case, then an increase in the median 
scores of the Minnesota children would be 
expected since the present study yielded evi- 
dence that the socioeconomic status of the 
subjects can significantly affect the results. 

Table 1 also indicates that the ceiling of 
the test was reached by two age groups. Since 
this has occurred previously in other studies 
(1), it seems rather clear that the ceiling of 
the Progressive Matrices test is apparently 
too low for a common normal pupil popula- 
tion, and a clear differentiation of scores at 
the upper end of the distribution is lacking. 


An analysis of variance was performed to 
study the operation of the age, sex, race, and 
socioeconomic status variables. As a matter 
of convenience the Ss were placed into five 
age groups ranging from seven to eleven years 
of age. Prior to performing the analysis of 
variance, Bartlett’s test of homogeneity was 
applied to the data and no significance was 
found. The results of the analysis are re- 
ported in Table 2. 


Table 2 


Analysis of Variance of Scores from 480 Children on 
Colored Progressive Matrices Test (1947) 


Source of Variation        Degrees of Freedom    Mean Square    F

Age                                                 1193.15     46.62*
Sex                                                   80.03      3.13
Race                                                2116.80     82.72*
Socioeconomic (S-E)                                  528.26     20.64*
Age X Sex                                             30.62      1.97
Age X Race                                            33.60      1.31
Age X S-E                                             24.65
Sex X Race                                            21
Sex X S-E                                             62.76      2.45
Race X S-E                                           210.99      8.25*
Age X Sex X Race                                       4.32
Age X Sex X S-E                                       13.73
Age X Race X S-E                                      44.26      1.73
Sex X Race X S-E                                      45.01      1.76
Race X Sex X Age X S-E                                51.65      2.02**
Within groups                                         25.59
Total

* .01 level of confidence.
** .05 level of confidence.


Table 1 (continued)

N    Range    Mean     SD

     4-33     17.27    3.74
     9-30     18.31
     9-33     20.06    5.53
     7-34     22.44    5.02
              23.08    5.76
     8-36     23.38    7.87
     10-35    25.20    6.42
     10-34    24.10    9.44
     10-36    27.00    5.48




Differences significant beyond the .01 level 
of confidence were found with the age, race, 
and socioeconomic status variables. There was 
no significant difference between the sexes. 
The interaction between race and socioeco- 
nomic status was also significant beyond the 
.01 level of confidence, while the third order
interaction was found to be significant only 
at the .05 level. 
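
The F ratios behind these significance statements can be checked from the mean squares reported in Table 2. The following is a minimal sketch, assuming that the within-groups mean square served as the error term for every effect (the article does not state the error term explicitly); the function name `f_ratio` is illustrative only.

```python
# Mean squares as printed in Table 2 (a subset, for illustration).
MEAN_SQUARES = {
    "Age": 1193.15,
    "Sex": 80.03,
    "Race": 2116.80,
    "Socioeconomic (S-E)": 528.26,
    "Race X S-E": 210.99,
}

# "Within groups" mean square from Table 2.
# Assumption: it is the error term for all effects.
MS_WITHIN = 25.59


def f_ratio(ms_effect, ms_error=MS_WITHIN):
    """Return the F ratio: effect mean square divided by error mean square."""
    return ms_effect / ms_error


for source, ms in MEAN_SQUARES.items():
    print(f"{source:22s} F = {f_ratio(ms):6.2f}")
```

Under that assumption the computed ratios agree with the printed values to within rounding, for example about 82.72 for race and about 8.25 for the race by socioeconomic status interaction.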

The significance found in this analysis mer- 
its some discussion. The test was built to yield 
scores increasing with the S’s age; therefore, 
the significant differences found here with age 
levels require no further justification. On the 
other hand, the test was not designed to dis- 
criminate racial differences. It is apparent 
from the significant race by socioeconomic 
status interaction and the third-order inter- 
action that a restriction on the interpretation 
of the race difference found is necessary. The 
measured differences in scores between races 
are related to the age, sex, and socioeconomic 
status of the Ss. The results cannot be inter- 
preted, therefore, as showing differences in 
intelligence between the races tested here. The 
differences found seem to depend upon varia- 
tions of the nonrace factors. 

Product-moment correlation coefficients 
were computed between the Progressive Mat- 
rices and three standard intelligence tests, 
utilizing the scores from recent school testing 
of the pupils. With the Otis Quick-Scoring 
Intelligence Test the r was .23 (N of 36), 
with the California Test of Mental Maturity 
it was .30 (N of 137), and with the Kuhl- 
mann-Anderson it was .41 (N of 215). 


Summary and Conclusions 


The Colored Progressive Matrices Test was 
administered to 480 St. Louis school children, 
ranging in age from seven to eleven years. The 
subjects were selected from three St. Louis 
public schools. One was an all-Negro school, 
another an all-white school, and the third en- 
rolled both white and Negro children. Three 
levels of socioeconomic status, based upon the
occupation of the father, were established for
purposes of comparison. The first category 
included professional and semi-professional 
workers; the second, skilled, semi-skilled and 
clerical workers; and the third, laborers, un- 
skilled, and service workers. 

The test was administered individually in 
book form to seven- and eight-year-old chil- 
dren. In the group administration, 2 x 2 color 
slides were projected to 9- to 11-year-old sub- 
jects in groups of 15 to 30. 

The findings lead to the following conclu- 
sions. 

1. The norms established by J. C. Raven 
for the Progressive Matrices appear to be 
valid for application to the population repre- 
sented by the sample in the present study. 

2. There is some indication on the basis of 
the range of scores, that the ceiling of the 
test is too low for a population of average 
children. 

3. Differences significant beyond the .01
level of confidence were found between the
various age levels (seven to eleven), between 
races (white and Negro), and between socio- 
economic levels. There was no significant dif- 
ference between the sexes. 

4. The Progressive Matrices Test showed a 
correlation with the Otis Quick-Scoring test of 
.23, with the California Test of Mental Ma- 
turity, .30, with the Kuhlmann-Anderson, .40. 


Received March 25, 1957. 


References 


1. Green, M. W., & Ewert, Josephine C. Normative 
data on the Progressive Matrices (1947). J. 
consult. Psychol., 1955, 19, 139-142. 

2. Kuhlmann, F., & Anderson, Rose G. Kuhlmann- 
Anderson Intelligence Test. Minneapolis: Edu- 
cational Test Bureau, 1952. 

3. Otis, A. S. Quick Scoring Mental Ability Tests. 
Yonkers, N. Y.: World Book Co., 1952.

4. Raven, J. C. Progressive Matrices (1947), Sets A, 
Ab, B. London: H. K. Lewis, 1947. 

5. Sullivan, E. T., Clark, W. W., & Tiegs, E. W. 
California Test of Mental Maturity. Los An- 
geles: California Test Bureau, 1951. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958


The Achievement Motive and Field Independence¹


Jack Wertheim and Sarnoff A. Mednick 


McClelland et al. ascribe the origins of the 
achievement motive to the “different experi- 
ences which children have” (2, p. 288). More 
specifically, they state that “mothers of sons 
with low need achievement tend to demand 
less in the way of independent achievement 
at an early age, and tend to be more restric- 
tive than other mothers” (2, p. 302). An in- 
teresting parallel may be seen in Witkin’s de- 
scription of the developmental precursors of 
field dependence; he states “field dependent 
children have been subjected during growth 
to strongly restrictive parental pressures” (4,
p. 87). He also points out that these parents 
tend not to demand independent behavior 
from their children. 

The elements of similarity in these writers’ 
descriptions of the developmental origins of 
these two personality variables suggested that 
an individual’s scores on the tests of these fac- 
tors should be related. More particularly, it 
was predicted that there is a positive cor- 
relation between field independence and need 
achievement. 


Method 


Forty-two undergraduate Ss (31 female, 11 male) 
were tested for need achievement (n-Ach) and field 
independence (FI) in that order. Ss took the n-Ach 
test in groups of three each. They were shown four 
slides projected on a screen for 20 sec. each and 
were asked to answer questions about the slide. As 
a representative of the FI test battery, a short form 
(1) of the Embedded Figures Test (EFT) was ad- 
ministered. S was shown a complex geometric figure 
followed by a simple figure and then asked to find 
the simple figure in the complex figure. 

Detailed descriptions of procedure and materials 
relative to these tests may be found elsewhere (1, 


1 The help of Drs. McClelland and Witkin in sup- 
plying materials and assistance is appreciated. 


Harvard University 




Results and Discussion 


The scoring of the n-Ach material was done 
by a trained scorer whose scoring reliability 
has been in the .90’s in previous tests. The 
higher an S’s score, the greater his n-Ach. The 
scores ranged from -4 to +13, with a mean
of 3.4. S’s score on the EFT was his total 
time to locate the embedded figures on all 12 
cards. In this case, the lower S’s score, the 
greater his field independence. The scores 
ranged from 120 to 955 sec., with a mean of 
462.8 sec. 

The product-moment correlation between 
n-Ach and EFT scores was -.40 (p < .01),
indicating a significant relationship between 
the two variables; the higher the achievement 
motive, the greater the field independence. 
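
The reported significance level can be checked with the usual t transformation of a product-moment correlation. The sketch below assumes a two-tailed test with df = N - 2 = 40 (the article does not say whether the test was one- or two-tailed); the critical value 2.704 is the tabled two-tailed .01 value of t for 40 degrees of freedom, and the function name is illustrative.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a product-moment r against zero, df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r, n = -0.40, 42          # correlation and number of Ss reported in the text
t = t_from_r(r, n)

# Tabled two-tailed .01 critical value of t for df = 40 (assumed test).
T_CRIT_01_DF40 = 2.704

print(f"t = {t:.2f}, df = {n - 2}")
print("significant at the .01 level:", abs(t) > T_CRIT_01_DF40)
```

The negative sign reflects only the scoring of the EFT, on which a lower total time indicates greater field independence.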

It is interesting to note that the similar- 
ity in the descriptions of the developmental 
origins of these two personality traits is re- 
flected in the observed relationship of the 
variables in adults. 


Received June 17, 1957. 


References 


1. Jackson, D. N. A short form of Witkin’s em- 
bedded figures test. J. abnorm. soc. Psychol., 
1956, 53, 254-255. 

2. McClelland, D. C., Atkinson, J. W., Clark, R. A., 
& Lowell, E. L. The achievement motive. New 
York: Appleton-Century, 1953. 

3. Witkin, H. A. Individual differences in ease of 
perception of embedded figures. J. Pers., 1950, 
19, 1-15. 

4. Witkin, H. A. et al. A study of change in percep- 
tion during psychological development and of 
the relation between changes in mode of per- 
ception and changes in personality function- 
ing. Progress Rep. USPHS Grant No. M-628, 
Coll. of Med., State University of New York, 
1956. 

5. Witkin, H. A. et al. Personality through percep- 
tion. New York: Harper, 1954. 


Journal of Consulting Psychology 
Vol. 22, 1958 


Assimilation, Failure-Avoidance, and Anxiety¹


David E. Hunt 
Yale University 


and Harold M. Schroder 


Princeton University 


The aims of the present investigation were 
(a) to develop a scale measuring the tendency 
to assimilate negative information and (b) to 
study the relationship of this assimilation 
tendency to other personality measures and 
to behavior in a problem-solving situation. 

In an earlier series of investigations (4) it 
was found that a subject’s (S’s) behavior in 
an actual situation of failure was related to
his interpretation of failure as measured by a 
paper-and-pencil “situational interpretation” 
test. In this test, S was confronted by a series 
of hypothetical negative situations (failure or 
criticism) and instructed to select one of two 
alternative interpretations following each situ- 
ation: one alternative represented self-nega- 
tion or decreased self-evaluation (e.g., “This 
shows I might not be so good”) and the other 
alternative represented an opposing interpre- 
tation of self-affirmation without any change 
in self-evaluation (e.g., “This is too difficult 
for anyone to do”). The latter alternative was
descriptively regarded as a “failure-avoidant”
interpretation because of its implied refusal to 
accept the possibility of failure. The behavioral
measure involved S’s attempts to solve
a problem under increasing failure stimulation.
The tendency for S to “leave the field”
by selecting a different goal when failing to
reach the initial goal was regarded as “failure-avoidant”
behavior.

1 The work described in this paper was supported
in part by a research grant M-955 from the National
Institute of Mental Health of the National Institutes
of Health, Public Health Service, and in part
under Contract Nonr-1858 (12) sponsored by the
group psychology branch, Office of Naval Research.
We are greatly indebted to the staff members of the
participating schools and camps in Connecticut, New
York, New Jersey, and Pennsylvania for their generous
cooperation. We would like to express our appreciation
especially to Ronald S. Wilson for his
help in coding responses and analyzing data, and also
to John McDavid, Jr. for his assistance in collecting
data.

Ss utilizing failure-avoidant interpretations 
on the situational test were found to be much 
more likely to utilize failure-avoidant behav- 
ior in the problem-solving situation. In dis- 
cussing this relationship, it was noted that the 
failure-avoidant interpretation process may 
effectively prevent the occurrence of assimila- 
tion; that is, in order to be assimilated, a 
negative event must first be accepted, at least 
provisionally. An individual experiencing a 
failing score or a critical remark must first 
categorize the event as possibly true before 
he can undertake any adaptive modification. 
He must say, in effect, “I may be wrong” be- 
fore he can ask, “How can I correct myself?” 
The provisional acceptance of negative infor- 
mation and the subsequent self-modification 
are both viewed as aspects of the assimilation 
process. 

In previous work we have assumed that
avoidance and assimilation represent opposing
poles on a process continuum but have not
verified this assumption by establishing its 
relationship to other measures. Therefore, in 
the present investigation a direct measure of 
the assimilation tendency was developed in 
order to study the assimilation characteristics, 
both general and specific, of groups distin- 
guished in terms of certain personality and 
behavioral variables. Following leads from 
the findings of other investigators (2), the 
relationship between anxiety and assimilation 
was also explored. The following hypotheses
concerning general tendencies to assimilate
negative information were tested:

I. Persons who give avoidant interpreta- 
tions score lower in assimilation tendency 
than persons who give nonavoidant interpre- 
tations. 

II. Persons high in anxiety score lower in 
assimilation tendency than persons low in 
anxiety. 

III. Persons who show failure-avoidant be- 
havior score lower in assimilation tendency 
than persons who show nonavoidant behavior. 


Method 


The assimilation scores were derived from 
sentence completion responses given to nega- 
tive stems. Combining the advantages of 
structured materials with freedom of response, 
the Sentence Completion Method (SCM) ap- 
peared most appropriate for present purposes. 
According to the provisions for scoring, two 
individuals may receive the same assimilation 
score by giving quite different responses, es- 
pecially in nonassimilation where the range of 
defensive possibilities such as aggression, de- 
nial, etc., is quite varied. In addition to pro- 
viding an over-all score for general assimila- 
tion tendency, the SCM responses could also 
be later reconsidered to note any specific fea- 
tures of assimilation or nonassimilation which 
might characterize the groups under consid- 
eration. 


Subjects and Procedure 


The SCM was administered to approxi- 
mately 800 white boys between the ages of 
13 and 18 during 1955. Groups of Ss ranging 
in size from five to 80 took the SCM (without 
any time limit), followed by the Situational 
Interpretation Experiment (4) and a short 
verbal intelligence test (6) consisting of 20 
vocabulary items. These 800 Ss will be con- 
sidered according to the following four 
samples: 

Sample A: 500 Ss drawn from YMCA 
camps, Boy Scout camps, public and private 
schools. 

Sample B: 79 Ss attending a parochial 
summer school for purposes of raising their 
grades. 

Sample C: 60 Ss representing the entire
junior and senior class in a consolidated high
school.

Sample D: 160 Ss drawn from sources 
similar to Sample A. 

For certain cross-validation purposes, SCM 
responses of selected Ss from 880 boys tested 
during 1956 were used (Sample E). 


Materials 


The five sentence stems, representing hypo- 
thetical situations of criticism and failure, are 
listed below. The number indicates the posi- 
tion of the stem in relation to the total 30 
stems which each S completed.

Criticism:

4. When someone criticizes me
19. If someone says I’m wrong
29. If someone says I have done poorly

Failure:

9. When I fail
22. When things go wrong


Development of Assimilation Scale 


Obtaining an assimilation score from five 
SCM responses required three steps: coding 
each response according to an objective sys- 
tem; transforming each coded response into 
an assimilation score; and summing the five 
scores to yield a total assimilation measure. 

Coding system. The system developed for 
coding SCM responses objectively was in- 
tended to minimize the number of categories 
and maximize the amount of information re- 
tained. The categories, given below in the As- 
similation scale, were described in a manual 
with examples for each category. Agreement 
between three judges² in coding 260 responses
was 93 per cent. In coded form, the responses 
could not only be easily transformed into an 
assimilation scale, but were “recoverable” for 
analysis at a more specific level than as- 
similation. 

Assimilation scale. Each coded response 
was assigned an assimilation score according 
to the following three-point scale, in terms of 
the explicit statement or implied likelihood 
that S would show assimilation or integrated 
modification in his behavior as a result of the
hypothetical negative event described in the
sentence stem.

2 Ronald S. Wilson and Roger C. Porr served as
judges along with one of the authors, and their
assistance is gratefully acknowledged.


Scale Score 0: Any form of denial; negative feeling 
or aggression; interaction rejected; pessimistic per- 
ception of interaction; reciprocation; self-disclaimed 
or other attributed causality; cessation of, disruption 
of, or decreased effort in behavior. 

Scale Score 1: Interaction accepted or questioned;
optimistic perception of interaction; attention to interaction;
conforming reaction to other; cognitive activity;
behavior repeated; effort maintained.

Scale Score 2: Self-attributed causality; behavioral 
modification through either compensation or correc- 
tion; effort increased. 


Transformation from coded responses to as- 
similation scale could be accomplished with 
almost perfect judge agreement. When the 
separate scores for the five SCM items were 
summed, an assimilation scale score (ranging 
from 0 to 10) was obtained for each S. The 
mean assimilation scale score for 107 ran- 
domly selected Ss in Sample A was 4.76, SD, 
2.34. Assimilation scores were not related to 
scores of verbal intelligence. 
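
The summing step described above can be sketched as follows. This is only an illustration: the stem numbers are those listed under Materials, while the per-stem scale scores in the example are hypothetical and the function name is an assumption, not part of the original scoring manual.

```python
# Positions of the five negative stems among the 30 SCM stems.
NEGATIVE_STEMS = (4, 9, 19, 22, 29)


def assimilation_total(scale_scores):
    """Sum five per-stem scale scores (each 0, 1, or 2) into a 0-10 total.

    `scale_scores` maps stem number -> scale score for one S.
    """
    if set(scale_scores) != set(NEGATIVE_STEMS):
        raise ValueError("scores are required for exactly the five negative stems")
    for stem, score in scale_scores.items():
        if score not in (0, 1, 2):
            raise ValueError(f"stem {stem}: scale score must be 0, 1, or 2")
    return sum(scale_scores.values())


# Hypothetical S: denial on stem 4, self-correction on stems 9 and 29,
# maintained effort on stems 19 and 22.
example_scores = {4: 0, 9: 2, 19: 1, 22: 1, 29: 2}
print(assimilation_total(example_scores))
```

For this hypothetical S the total is 6, somewhat above the mean of 4.76 reported for the 107 randomly selected Ss.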

Reliability. Two SCM forms were adminis- 
tered to Sample C at a six-month interval. 
Since the forms changed slightly, there were 
only three identical negative stems occurring 
on both forms (“When someone criticizes me,” 
“When I fail,” and “When things go wrong”).
Computation of six-point assimilation scores 
based on these three stems for both forms 
yielded a test-retest reliability coefficient of 
.64 (N = 48), which was estimated (by Spear- 
man-Brown prophecy formula) to be .75 if 
based on five, rather than three, stems. 
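
The lengthening estimate can be reproduced with the Spearman-Brown formula, r' = kr / (1 + (k - 1)r), where k is the ratio of the new test length to the old. The sketch below assumes k = 5/3 (five stems in place of three); the function name is illustrative.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test is lengthened by a factor of k."""
    return k * r / (1.0 + (k - 1.0) * r)


r_three_stems = 0.64      # test-retest coefficient reported for three stems
k = 5.0 / 3.0             # lengthening factor: five stems instead of three

r_five_stems = spearman_brown(r_three_stems, k)
print(f"estimated reliability for five stems: {r_five_stems:.2f}")
```

The result, about .75, agrees with the figure given in the text.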


Identification of Extreme Groups 


Failure-avoidant interpretation. This vari- 
able was defined by the score on the criticism- 
failure subscale of the Situational Interpreta- 
tion Experiment (4). Each of the ten items 
on this subscale presented a hypothetical situ- 
ation of criticism or failure with instructions 
for S to select one of the two alternative 
choices that would best indicate the thought 
he would have in that situation. The total 
number of self-negating or “nonavoidant” 
choices was added to obtain S’s score so that 
a low score indicated a greater tendency to 
make avoidant interpretations. From Sample 
A, a failure-avoidant interpretation group 
(scoring 0 to 2) of 19 Ss and a nonavoidant
interpretation group (scoring 10) of 31 Ss
were selected (see Table 3 in reference 4 for 
distribution of scores). Elimination of one S 
from each group because of an incomplete 
SCM left 18 and 30 Ss, respectively, or ap- 
proximately the lower four per cent and the 
upper six per cent of the distribution. 

Manifest anxiety. This variable was defined
by the score on the Taylor Manifest Anxiety
Scale (5). From Sample B, a low anxiety
group (scoring 2 to 8) of 11 Ss and a high 
anxiety group (scoring 28 and above) of 11 
Ss were selected representing the lower and 
upper 13 per cent of the distribution. 

Failure-avoidant behavior. This variable
was defined as the extent to which S avoids 
a situation in which he is receiving gradually 
increasing failure by selecting a different goal 
rather than continuing to strive toward the 
same goal (4). From Sample C, a behavioral 
failure-avoidant group of 14 Ss who avoided 
the situation at the earliest signs of failure 
and a behavioral nonavoidant group of 14 Ss 
who remained in the situation through all 12 
trials were selected, representing the upper 
and lower 29 per cent of the distribution. 


Results 
Assimilation Scale 


Table 1
Assimilation Score Comparisons of Extreme Groups

Hypothesis   Extreme Group                     Assimilation    t      df    p*
                                               Score Mean

I            Failure-avoidant interpretation   3.06
             Nonavoidant interpretation        5.17

             High anxiety                                      3.80   20    <.01

             Failure-avoidant behavior
             Nonavoidant behavior                              1.61   26    <.10

* By one-tailed test.

4.01
5.94

Hypothesis I. Failure-avoidant interpretation.
As indicated in Table 1, the mean assimilation
score for the avoidant interpretation
group is significantly lower (p < .01) than
the mean for the nonavoidant group. Extreme
groups on this interpretation measure were 
selected from Sample D in an attempt at rep- 
lication. In this sample, the mean of 21 avoid- 
ant Ss scoring 0-3 (lower 9 per cent) was 
lower, but not significantly so (p < .10, one- 
tailed test), than the mean of the 27 non- 
avoidant Ss scoring 9-10 (upper 18 per cent) 
(t = 1.55, df = 46). A nonsignificant Pearsonian
r of .12 was obtained between these
two measures on 107 randomly selected Ss, 
indicating that whatever relationship exists is 
limited to extreme groups. 

Hypothesis II. Anxiety. The mean assimilation
score for the group high in manifest anxiety
is significantly lower (p < .01) than the
mean for the group low in anxiety. This inverse
relationship is also supported by significant
correlations in the samples: r = -.32
for 79 Ss in Sample B (p < .01) and r = -.28
for 32 Ss in Sample A for whom anxiety scores
were available (p < .05).

Hypothesis III. Failure-avoidant behavior.
The hypothesis concerning avoidant behavior 
was not supported, although the difference be- 
tween assimilation scores was in the predicted 
direction (p < .10). A test of this hypothesis 
in another sample also failed to confirm it. 


Specific Assimilation Characteristics of 
Extreme Groups 


The coded SCM responses were examined 
for more specific characteristics of assimilation
or nonassimilation in each group, and two
SCM code categories, denial and aggression, 
were identified as potentially differentiating. 
The denial code included the negation of feeling
(“I don’t mind”) and negation of aggression
(“I don’t get mad”) as well as denial of
attention to the source of negative informa- 
tion (“I ignore them”). The aggression code 
included explicitly hostile responses (e.g., “I 
get mad,” “I blow up”). These measures were 
recorded simply in terms of occurrence or 
nonoccurrence on any one of the five SCM 
stems. Incidence of denial and aggression was 
17 and 14 per cent, respectively, for 107 ran- 
domly selected Ss. 

In addition to these two SCM measures, an 
index of the incidence of self-correction was 
also available for the extreme groups differen- 
tiated with respect to failure-avoidant behav- 
ior. A situation of academic failure was de- 
scribed and S was instructed to imagine that 
he was in the situation and then write down 
his first reaction. Self-correction was coded 
for responses such as “Find out what I did 
wrong” or “Find out my mistake,” and oc- 
curred for 21 per cent of Ss in Sample C. 

Failure-avoidant interpretation. The avoidant
interpretation group gave significantly
more (< .01) denial responses than the nonavoidant
interpretation group, but did not
differ in aggression. Extreme groups were selected
from Sample E to attempt a replication
of this defense-specific finding, but no difference
in denial was found.


Table 2
Specific Response Comparisons of Extreme Groups

Columns: Extreme Group; N; Incidence of Denial on SCM; Incidence of
Aggression on SCM; Incidence of Self-correction on First-reaction


Failure-avoidant interpretation 18
Nonavoidant interpretation 30
Corrected χ²


High anxiety 11
Low anxiety 11
Corrected χ²


Failure-avoidant behavior 14
Nonavoidant behavior 14
Corrected χ²


39 1 Not available
13 Not available


0


18 72
-18


Not available
Not available


21
21


4.57*'
28° 07
14 57
~ 6.22°


Chi square computed by actual number of Ss rather than by proportions.
* p < .05.
** p <

Anxiety. Although no difference in denial 
is noted between anxiety groups, the high 
anxious group gave significantly more (<.05) 
aggressive responses than the low anxious 
group. From 245 Ss in Sample E with anxiety 
scores available, the upper and lower 13 per 
cent were selected, and this anxiety-aggression 
relationship was successfully replicated (corrected
χ² = 4.22; df = 1; p < .05).
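
For a 2 x 2 comparison of this kind, a chi square corrected for continuity can be computed as sketched below. The cell counts are hypothetical and chosen only for illustration (two extreme groups of 32 Ss each, roughly 13 per cent of 245); they are not the frequencies obtained in Sample E, which the article does not report. The critical value 3.84 is the tabled .05 value of chi square for one degree of freedom.

```python
def yates_chi_square(a, b, c, d):
    """Chi square with Yates' continuity correction for a 2 x 2 table.

    Layout:             response present   response absent
        group 1               a                  b
        group 2               c                  d
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: 14 of 32 high anxious Ss and 5 of 32 low anxious Ss
# give at least one aggressive response.
chi_sq = yates_chi_square(14, 18, 5, 27)

CHI_CRIT_05_DF1 = 3.84    # tabled .05 critical value, df = 1
print(f"corrected chi square = {chi_sq:.2f}")
print("significant at the .05 level:", chi_sq > CHI_CRIT_05_DF1)
```

The correction subtracts N/2 from the absolute cross-product difference before squaring, which makes the test more conservative with groups as small as those compared here.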

Failure-avoidant behavior. The behavioral 
groups differed only in self-correction, this re- 
sponse occurring significantly more often (< 
.05) in the nonavoidant behavioral group than 
in the avoidant group. First-reaction measures 
were not available in other groups for repli- 
cation. It might be noted that self-correction 
did not occur with sufficient frequency on the 
SCM to provide a basis for discriminating 
between groups. 


Discussion 


We have assumed earlier (4) that assimilation
and failure-avoidant interpretation represent
opposing poles on a continuum of reactions
to negative events. However, this assumption
is called into question by the failure
to replicate Hypothesis I (p < .10). This 
nonsignificant difference may have resulted 
partially from the fact that the groups used 
for replication were less extreme in their in- 
terpretation scores (0-3 vs. 9-10) than the 
initial extreme groups. As we have discussed 
elsewhere (4, p. 19), the interpretation test 
was designed primarily to identify Ss extreme 
in their interpretation tendencies rather than 
to measure a continuously distributed charac- 
teristic. Therefore, scores in the middle range 
(even as low as 3) are accordingly imprecise. 
Therefore, rather than inferring an avoidance- 
assimilation continuum, it seems safer to 
conclude that Ss who are extremely avoid- 
ant will have a very low tendency to assimi- 
late negative information. That is, if we use 
the mean assimilation score of 107 randomly 
selected Ss (4.76) as an estimate of the as- 
similation base rate, the avoidant interpreta- 
tion mean score (3.06) is significantly lower 
than this group base rate (t = 2.90, p < .01).

Similar to the present constructs of assimilation
and avoidance, though used in a different
context, are the processes of assimilation
and contrast recently described by Hovland, 
Harvey, and Sherif (1) to account for individ- 
ual variation in attitude change studies. Using 
communication statements as stimuli rather 
than the hypothetical negative events pres- 
ently employed, these authors demonstrated 
that Ss taking an extreme stand on a given 
issue have narrower “latitudes of acceptance” 
for statements which differ from their own on 
an issue. When exposed to such differing state- 
ments, these Ss with extreme attitudes were 
found to reject the communication by con- 
trasting it to their own position rather than 
by assimilating the communication. In a dif- 
ferent stimulus realm, our avoidant interpre- 
tation Ss may be regarded as “contrasting” 
negative information by their use of some 
nonassimilative maneuver, such as denial, 
which may represent a narrow range of ac- 
ceptance. 

Although the greater incidence of denial in 
avoidant interpretation Ss would be congruent 
with the assumed underlying blame-avoidant 
tendency, this defense specific relationship 
must be regarded as highly tentative since 
replication was unsuccessful. It seems likely 
that the denial category requires more precise 
response referents and more careful attention 
to situational (or stem) specific determinants. 

On the basis of the successful replications 
in both instances, the tendency for anxious Ss 
to be “low assimilators” and, more specifically,
to give more aggressive responses appears
reasonably well established for the present
population. That the high anxious Ss do
not give more denial responses than the low 
anxious Ss is not surprising when one con- 
siders that denial would very likely prevent 
the admission of anxiety in the testing situa- 
tion. The greater incidence of aggressive re- 
sponses, but not denial, in high anxious Ss 
agrees with earlier findings employing two 
similar instruments. In reporting correlates of 
the Test Anxiety Questionnaire, Sarason and 
Mandler (3) noted a significant correlation 
between test anxiety and the aggression sub- 
scale of the Child-Waterhouse Interfering 
Tendency Questionnaire (7) (r = .29; N= 
62; p < .05). However, in considering the defensive
subscale, which from its definition
(“need to justify or rationalize away one’s failures”;
7, p. 301) appears to be similar to
denial, the correlation to test anxiety did not
approach significance.

The present results do not permit anchoring 
the assimilation tendency in behavior, al- 
though the assimilation scale differences be- 
tween extreme behavioral groups were in the 
hypothesized direction. Although data were 
not available for replication, the greater in- 
cidence of a specific form of assimilation— 
self correction—in nonavoidant behavioral Ss 
appears quite reasonable. That is, these Ss 
(who may be regarded as “persistent”) were 
selected on the basis of their remaining in the 
problem situation utilizing failure cues as in- 
formation for developing new solutions, a be- 
havior which would presumably require some 
process of self-modification. 

Primarily because of its free response char- 
acteristic which allows the recovery of re- 
sponses for more specific analysis such as is 
summarized in Table 2, the SCM, particularly 
in its highly structured present form, appears 
to provide a potentially valuable method for 
the initial exploration of problems in this area. 


Summary 

A scale measuring the tendency to assimi- 
late negative information was developed, 
based on SCM responses of 800 white boys 
between the ages of 13 and 18. Assimilation 
scores were obtained from responses to five 
SCM stems structured as negative situations 
(e.g., “When someone criticizes me . . .”) by 
three steps: (a) coding each response according
to an objective system (inter-judge agreement
= 93%); (b) transforming each coded
response into an assimilation scale score (a 
three-point scale based on the likelihood that 
S would integrate the negative event into sub- 
sequent behavior); and (c) summing these 
five scores. Corrected test-retest reliability for 
the assimilation score over six months = .75. 

Subgroups of Ss representing extremes on 
three other measures were identified in order 
to test assimilation hypotheses: Taylor anxiety
scale; failure-avoidant interpretation measure
(tendency to avoid interpreting negative
events as failure); and failure-avoidant be- 
havior. The results were as follows: 

1. Ss high in anxiety scored significantly
lower in assimilation tendency and gave significantly
more aggressive responses than Ss
low in anxiety. 

2. Ss who showed nonavoidant behavior
gave significantly more self-correction responses
than Ss showing failure-avoidant behavior.

3. Ss who gave avoidant interpretations 
scored lower in assimilation tendency than Ss 
giving nonavoidant interpretations, but this 
difference was not successfully replicated. 

These findings were considered in relation- 
ship to earlier assumptions regarding assimi- 
lation and avoidance; to other similar con- 
structs (assimilation and contrast); and to 
other similar measures (Interfering Tendency 
Questionnaire). The potential value of the 
SCM for investigating problems in this area 
was noted. 


Received March 20, 1957. 


References 


1. Hovland, C. I., Harvey, O. J., & Sherif, M. Assimilation
and contrast effects in reactions to
attitude change. J. abnorm. soc. Psychol., 1957, 
55, 245-253. 

2. Mandler, G., & Sarason, S. B. A study of anxiety 
and learning. J. abnorm. soc. Psychol., 1952, 
47, 166-173. 

3. Sarason, S. B., & Mandler, G. Some correlates of 
test anxiety. J. abnorm. soc. Psychol., 1952, 
47, 810-817. 

4. Schroder, H. M., & Hunt, D. E. Failure-avoidance 
in situational interpretation and problem solv- 
ing. Psychol. Monogr., 1957, 71, No. 3 (Whole 
No. 432). 

5. Taylor, Janet A. A personality scale of manifest 
anxiety. J. abnorm. soc. Psychol., 1953, 48, 
285-290. 

6. Thorndike, R. L., & Gallup, G. H. Verbal intelligence
of the American adult. J. gen. Psychol.,
1944, 30, 75-85.

7. Waterhouse, I. K., & Child, I. L. Frustration and 
the quality of performance: III. An experi- 
mental study. J. Pers., 1953, 21, 293-311. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


Q-Sort Correlation: Stability and Random Choice 
of Statements 


Arnold H. Hilden 
VA Regional Office, St. Louis and Washington University 


The problem of the reliability of correla- 
tions obtained on Q sorts does not appear to 
have been systematically investigated. The 
very nature of Q sorts would seem to pre- 
clude the possibility of determining reliability 
through the usual approaches of alternate 
forms, matched items, and split halves. In a 
typical Q-sort experiment, an individual indi- 
cates the degree to which each item of a group 
describes or corresponds to the object or value 
in question. For example, in a self sort, he 
would sort statements according to how he 
felt they described him; in an ideal sort, he 
would sort them as he would like to be. Since 
there is no advance knowledge of his concep- 
tions of himself, it would be impossible to con- 
struct alternate forms, to match statements, 
or to split a group of statements into halves 
of equal significance for each individual for a
given frame of reference. The object toward 
which attention is being directed would differ 
uniquely for each person, since each is evaluating
himself.

It seems an inescapable conclusion that no 
solution exists. Perhaps the restriction im- 
posed by adherence to the usual methods cre- 
ates the problem. Reliability is a matter of 
high importance. It reflects consistency of 
measurement and stability of performance. If 
we are not sure of the degree of consistency 
of measurement, we are faced with ambiguity 
in evaluating the changes which our instru- 
ments may reflect. The criterion of success as 
to reliability is the correlation between results 
obtained on two or more performances with 
an instrument or its comparable form. The 
method proposed here results in a high de- 
gree of consistency among 20 alternate forms 
chosen entirely on a random basis, with no
prior evaluation of a single item of any of the 
20 forms. The author has found no evidence 
in the literature of the use of this approach 
as a means of achieving consistency of meas- 
urement or stability of performance. Since the 
method proposed is not one of those usually 
employed, the question might be raised as to 
whether this is a measure of reliability. The 
question might also be raised as to whether 
the concept of reliability should be broadened 
to include the method used here. 


A Universe of Personal Concepts


Any method based on matching or weight- 
ing items offers a most serious difficulty. Pre- 
determination of values of individual items 
for Q sort does not seem feasible. A different 
approach seems indicated, one which could re- 
sult in comparability of results by virtue of 
the method by which groups as a whole 
are selected. One possible approach would be 
based on sampling theory. Unbiased samples 
of statements from a clearly defined universe 
should yield results which are comparable be- 
tween the groups so selected. The question, 
then, is the definition of the universe. The 
number of universes from which a selection 
could be made is incalculable. The first con- 
sideration would appear to be the broad na- 
ture of the universe, that is, would it involve 
any particular dimensions such as person- 
ality traits, or concepts of accepted clinical 
importance, or the formulations of some theo- 
retical approach to personality? Whether or 
not there is merit in the avoidance of par- 
ticular dimensions characterizing a universe of 
statements depends in part upon the use for 
which the universe is intended. Perhaps of 
greater importance is the difficulty of defining
any universe with specific dimensions so that 
all possible members of it may be accounted 
for and identified. It is not implied that a uni- 
verse can be established without any dimen- 
sions. At least one is necessary in order to 
establish a boundary within which all mem- 
bers would lie. The dimension chosen was 
that of difficulty. The universe consists of 
statements based on words not above the 
sixth-grade level. It would seem unwise to 
confront the majority of the population, par- 
ticularly the psychological deviates, with 
statements employing words beyond the sixth- 
grade level. The use of easy words is not 
merely a practical consideration; it is essen- 
tial, to reduce to a minimum the distortions 
which might affect results through lack of 
understanding. No matter how technically 
correct a formulation of human reaction might 
be, if the ability of the subject to understand 
it be disregarded, the effect may vary from 
person to person, and to a degree difficult to 
estimate. To repeat, all of the words upon 
which the statements comprising the universe 
are based are at the sixth-grade level of diffi- 
culty or below. 

The members of a universe of difficulty may 
be identified on an objective basis through 
the use of Thorndike and Lorge’s The Teach- 
er’s Word Book of 30,000 Words (4). Part I 
of this volume contains the 19,440 words 
which were found to occur at least once per 
million words in current popular reading. The 
frequency of each word is indicated by a num- 
ber or letter. Accepting their statement re- 
garding translation of frequency of occur- 
rence into grade level (4, p. xi), words with 
a frequency of ten or more per million may 
be taken as being at the sixth-grade level or 
below. Thus, 6,092 words are so classified: at 
or below the sixth-grade level of difficulty. 
The next step is to select the words which are 
to constitute the basis for the universe. 

As stated above, selection did not involve 
particular theories of personality, nor accepted 
clinical terminology, nor specific variables as 
such. In order to be as inclusive as possible, 
determination of statements included in the 
universe was made on the following basis. 
Any word was accepted for the universe if 
the meanings given in the Thorndike Century 
Senior Dictionary (3) could readily be used
to formulate a statement of human reaction, 
behavior, feeling, thinking, etc. At the present 
stage of endeavor, it does not seem best to 
state the basis more specifically. Such a for- 
mulation is designated a personal concept. 
Whenever a dictionary definition provided a 
formulation of a personal concept, it was ac- 
cepted for inclusion in the universe. In some 
instances, a difficulty presented itself in that 
many meanings were given for some words; 
for example, 40 meanings were given for the 
word “set.” The selection of meanings used in 
this dictionary was stated to be based on
Lorge’s English Semantic Count (3, p. vii), 
in which the frequency of each meaning was 
counted. It was the use of a semantic fre- 
quency count which prompted the use of this 
dictionary in the present method. The order 
of arrangement of the meanings of a word 
was such that common and general meanings 
were put before rare and special meanings 
and present-day meanings put before archaic 
meanings. Thus, in cases where several mean- 
ings were listed, a consistent and relatively 
objective basis was provided for selection of 
the meaning to be used. Where more than one 
definition could be used as a basis for a per- 
sonal concept, the one selected was the first 
one in the order listed. This procedure tended
to bring into the formulations comprising the
universe the meanings most frequently employed
and likely to be most easily understood.


As an example, the words which were indicated to 
occur 10 or more times per million words on Page 1, 
Column 1, of The Teacher's Word Book of 30,000
Words (4) were: abandon, abbey, abbot, abide, 
ability, able, aboard, abode, abolish. Following the 
procedure indicated above, three of these were used: 
abandon, abide, able. The respective formulations 
made on the basis of these words were: “I yield 
completely to my feelings,” “I stand firm on my 
ideas,” “I am an able person.” Selecting another page 
at random, the words indicated on Page 94, Column 
1, were: improve, improvement, impulse, in. Two of 
these were used: improve and impulse, for the fol- 
lowing formulations: “I seek to improve myself,” 
and “I act on impulse.” 


Following this procedure throughout the en- 
tire dictionary, a total of 1575 statements 
were formulated. The results of the first en- 
deavor were set aside for a period of weeks, 
then carefully reviewed in their entirety to
obtain a fresh approach. After another pe- 
riod of several weeks, they were reviewed a 
third time. In the compilation which consti- 
tutes the Universe of Personal Concepts 
(UPC) (2), these statements were printed in 
serial order, each page (but the last) having 
50 statements, for a total of 32 pages. The
statements on each page were numbered from 
01 to 50, and the pages from 01 to 32. For 
example, the 20th statement on Page 3 was 
numbered 0320, “I trust people”; and the 
35th statement on Page 6 was numbered 
0635, “I am a coward.” 
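
The numbering scheme can be illustrated with a short sketch. The function names are hypothetical, and the assumption that the last page holds the remaining 25 statements is inferred from 1575 = 31 × 50 + 25 rather than stated in the text.

```python
STATEMENTS_PER_PAGE = 50
TOTAL_STATEMENTS = 1575


def upc_code(serial):
    """Convert a 1-based serial position in the UPC to its four-digit code."""
    if not 1 <= serial <= TOTAL_STATEMENTS:
        raise ValueError("serial position outside the UPC")
    page, offset = divmod(serial - 1, STATEMENTS_PER_PAGE)
    return f"{page + 1:02d}{offset + 1:02d}"


def upc_serial(code):
    """Convert a four-digit code (page pair, statement pair) to a serial position."""
    page, statement = int(code[:2]), int(code[2:])
    serial = (page - 1) * STATEMENTS_PER_PAGE + statement
    if page < 1 or not 1 <= statement <= STATEMENTS_PER_PAGE or serial > TOTAL_STATEMENTS:
        raise ValueError("code does not identify a UPC statement")
    return serial


print(upc_code(120))       # the 20th statement on Page 3
print(upc_serial("0635"))  # the 35th statement on Page 6
```

Under these assumptions the final statement in the compilation would carry the code 3225.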


Random Sets of Personal Concepts 


With a Universe of Personal Concepts at 
hand, it was a simple matter, with random 
decimal digit tables (5), to obtain random 
sets of statements from the UPC, which have 
been designated Random Sets of Personal 
Concepts (RSPC). By careful adherence to 
the principles of random selection, some de- 
gree of confidence may be placed in each 
RSPC as a representation of the preferred 
meanings in the entire dictionary, within a 
given level of simplicity or difficulty. Within 
the limits of experimental error, some degree 
of confidence may also be felt that results ob- 
tained on one set of RSPC will agree with re- 
sults obtained on any other RSPC and with 
the results on the UPC. Furthermore, the ex- 
perimenter is now in a position to determine 
the limits of error of measurement. 

Random Sets of Personal Concepts were 
selected from the UPC through the use of a 
decimal digit table (5). Selecting a point at 
random to enter the table, and moving hori- 
zontally to the right, four digits were taken at 
a time. The first pair of digits indicated the 
page of the UPC; the second pair, the num- 
ber of the statement on that page. At each 
drawing, therefore, each statement in the 
UPC had an equal opportunity to be selected. 
This procedure was repeated until 50 state- 
ments were drawn, the number which had 
been decided upon for each RSPC. Twenty 
RSPC were drawn. The same personal con- 
cept could appear and did appear in different 
RSPC. On one occasion the same statement 
was drawn twice in succession for the same 
RSPC; the second drawing was ignored. Each 
statement was printed on a card 1” X 24”,
with a code number corresponding to its serial 
position in the RSPC, to facilitate recording. 
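
The drawing procedure may be sketched as follows. This is an illustration only: a seeded pseudo-random generator stands in for the printed table of random digits, and the rule that four-digit groups naming no statement (for example, a page above 32) are simply skipped is an assumption, since the text does not say how such groups were handled. All function names are hypothetical.

```python
import random

STATEMENTS_PER_PAGE = 50
TOTAL_STATEMENTS = 1575
SET_SIZE = 50


def is_valid_code(page, statement):
    """True when the page/statement pair identifies one of the 1575 statements."""
    if page < 1 or not 1 <= statement <= STATEMENTS_PER_PAGE:
        return False
    return (page - 1) * STATEMENTS_PER_PAGE + statement <= TOTAL_STATEMENTS


def draw_rspc(digits, set_size=SET_SIZE):
    """Read four digits at a time from an iterator and collect one random set."""
    chosen = []
    while len(chosen) < set_size:
        group = [next(digits) for _ in range(4)]
        page = group[0] * 10 + group[1]
        statement = group[2] * 10 + group[3]
        code = f"{page:02d}{statement:02d}"
        # Assumption: groups that name no statement are skipped. A repeat
        # within the same set is ignored, as the text describes.
        if is_valid_code(page, statement) and code not in chosen:
            chosen.append(code)
    return chosen


def digit_stream(seed):
    """Stand-in for the printed table: an endless stream of pseudo-random digits."""
    rng = random.Random(seed)
    while True:
        yield rng.randrange(10)


stream = digit_stream(seed=1958)
random_sets = [draw_rspc(stream) for _ in range(20)]
print(random_sets[0][:5])
```

Because every valid code is equally likely to appear in a four-digit group, skipping invalid groups leaves each statement with an equal chance at every drawing. The same code may recur in different sets, as in the original procedure.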


Testing Sampling Reliability 


In its broader sense, reliability is not to be 
conceived of simply as the correlation be- 
tween two instruments or the correlation be- 
tween two applications of the same instru- 
ment. The concept of reliability is embraced 
in a most fundamental sense when there is 
consistency between unbiased samples from a 
clearly defined universe and between these 
samples and the universe. Seldom is the in- 
vestigator in a position to define the bounda- 
ries of a universe clearly and to identify all 
the members of the universe so that each has 
an equal chance of being drawn in any selec- 
tion for a sample. When that situation does 
exist, the experimenter is in a position to de- 
termine parameters, that is, relationships in- 
volving the universe, in addition to statistics, 
that is, relationships involving samples and 
the universe. Stated in general terms, then, 
the hypothesis is advanced that the correla- 
tions obtained on Q sorts with RSPC will 
agree with the correlations obtained on Q 
sorts with the UPC. 


Procedure 


The subjects in this experiment were four 
male graduate students in psychology at Washington
University.¹

The correlations involved three kinds of re- 
lationships on each RSPC: (a) r’s between 
self concept and ideal concept (intra-individual,
S-I), (b) r’s between self sorts of
each individual and every other individual 
(inter-individual, S-S), and (c) r’s between 
ideal sort of each individual and every other 
individual (inter-individual, I-I). 

Each subject was required to do a self sort 
and an ideal sort on each of the 20 RSPC 
and on the UPC. Cards were sorted into nine 
piles, with categories ranging from “most like” 
to “most unlike.” For the self sort, the categories
at the extremes were “most like you”
and “most unlike you”; for the ideal sort,
“most like you want to be” and “most unlike
you want to be.”

¹ Indebtedness is acknowledged to B. Buzzotta, J.
Nolte, D. Petrovich, and R. Soskin, for their interest
and sustained effort. Among the conditions required
for a valid test of the hypothesis would be one of
optimal maintenance of interest and attention to
such a task. In Table 1, there is no relation between
the order of their names and the order of the subjects.

It was postulated that the degrees of relevance
or pertinence for any person for any
frame of reference are normally distributed 
for the UPC. For nine categories of distinc- 
tion, the distribution of proportions of the 
theoretical normal curve should be as fol- 
lows: .0401, .0655, .1210, .1747, .1974, .1747, 
.1210, .0655, and .0401. Translating these
proportions for the UPC in terms of whole 
numbers, the 1575 statements should be 
sorted into the following groups: 63, 103, 
191, 275, 311, 275, 191, 103, and 63. The 
distribution used in this experiment conformed 
to these proportions. Since each RSPC was 
randomly drawn, the distribution of degrees 
of relevance for each is thereby considered 
normal. The distribution used in sorting each 
RSPC was therefore determined to be: 2, 3, 
6, 9, 10, 9, 6, 3, and 2 (1). 
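
The proportions and pile sizes above can be checked with a brief sketch. The cut points at ±0.25, ±0.75, ±1.25, and ±1.75 standard deviations are an assumption inferred from the listed proportions (nine intervals, each half a standard deviation wide, with open tails); the text does not state them. Function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative proportion of the unit normal curve below z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def category_proportions():
    """Area of the normal curve falling in each of the nine categories."""
    cuts = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]  # assumed cut points
    bounds = [0.0] + [normal_cdf(z) for z in cuts] + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]


def pile_sizes(n_items):
    """Round each category's share of n_items to a whole number of statements."""
    return [round(p * n_items) for p in category_proportions()]


print([round(p, 4) for p in category_proportions()])
print(pile_sizes(1575))
print(pile_sizes(50))
```

Simple rounding happens to reproduce totals of 1575 and 50 here; for other set sizes the rounded piles would not necessarily sum to the number of items and might need adjustment.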

The order in which the sorts were done was 
as follows: (a) all 20 self sorts on the RSPC,
(b) all 20 ideal sorts on the RSPC, (c) the
self sort on the UPC, and (d) the ideal sort 
on the UPC. After completing the 20 sorts in 
the first series, it seemed almost certain that 
no one could remember the sorting of a par- 
ticular RSPC at the time of the second series. 
The subjects were certain they had no mem- 
ory of any particular previous sort. The im- 
portance of memory in an experiment such as 
this is unknown, but it would seem safe to 
say that any effect from memory would be of 
negligible significance. After completion of the 
20 ideal sorts, the two sorts on the UPC were 
done in the order stated. The execution of a Q 
sort constituted an assignment of weighted 
scores by the subject, ranging from nine at 
the “most like” to one at the “most unlike” 
end of the categories. 

The data obtained from all of the sorts
referred to in the preceding paragraph fur- 
nished the basis for the 336 intercorrelations 
in Table 1. At this point it may be well to 
clarify in detail what was involved in the cal- 
culation of a correlation coefficient, so that 
the reader may know what was correlated 


Table 1 


Q Sort r’s Obtained on 20 RSPC and the UPC on Four Subjects, Based on Intra-individual Comparisons 
(Self-ideal) and All Inter-individual Comparisons (Self-self and Ideal-ideal)


Subject and Sort 


Random A-A B-B C-C D-D A-B A-C A-D B-C B-D C-D A-B A-C A-D B-C B-D C-D Random
Set S-I S-I S-I S-I S-S S-S S-S S-S S-S S-S I-I I-I I-I I-I I-I I-I Set
00 4 92 .73 63 .76 52 75 3s ST 71 6S .80 00 
o1 80 87 46 49 82 5S 69 80 .78 78 74 80 o1 
02 73 81 66 52 64 82 SS £2 .71 16 80 02 
03 75 81 50 91 32 61 56 68 76 03 
04 84 80 40 90 s7 35 51 75 313%. 80 84 O04 
os 8S 80 77 71 38 64 57 67 36 73 TT 87 os 
06 8s 7 54 51 35 «63 61 77 a «89 06 
07 82 87 | 70 S473 67 74 2 ae 85 07 
08 8 54 60 6S .84 08 
09 90 62. AT 43 77 7 66 84 09 
10 83 SS 70 64 48 60 66 64 80 86 .76 10 
il 84 .78 «SA 8S 44 860 4 68 66 55 il 
13 69 .80 47 SS 62 54 .76 69 63 .76 70 81 13 
14 59 51 72 70 80 76 86 14 
15 77 71 66 80 69 5S8 69 67 62 EL 67 76 =6.70 76 79 15 
16 83 60 59 65 48 62 72) «(67 77 83 16 
17 77 79 47 64 Al 65 54 .80 77 £4 64 81 77 17 
18 82 .78 49 71 56 $2 69 61 60 64 71 83 18 
19 770 733 80 8.39 41 56 .56 46 55 70 69 82 61 57 79 19 

Mean r 81 .76 $2 81 4S «SS 68 58 72 as. av 74 74 82 

SD, 06 .08 07 07 09 09 07 os 10 07 O7 04 

UPCr 76 56 59 60 72 69 69 638 71 83 


Note: A, B, C, and D indicate the four subjects; S indicates self sort; I indicates ideal sort. For example: A-A, S-I indicates the self-ideal sort of Subject A.

with what. The subjects are designated by 
letters A, B, C, and D, and the Q sorts by S 
(for self sort) and I (for ideal sort). The 
RSPC were numbered from 00 through 19. 
Consider Subject A’s self sort on RSPC 00. 
He sorted the 50 items into nine piles, which 
in effect meant he had given weights or
“scores” ranging from one to nine to each
item. When Subject B performed a self sort
on RSPC 00, he did the same. Thus, each of
the 50 items in RSPC 00 was “scored” by the
two subjects. These 50 pairs of numerical
values provided the basis for the r of .73
found at the top of Column 5, under A-B,
S-S. Again, when Subject A performed an
ideal sort on RSPC 00, he “scored” the 50 items
in that set a second time; the 50 pairs of
numerical values yielded the r of .78 found
at the top of the first column in Table 1, under
the heading A-A, S-I. Each subject was
compared with himself in terms of self sort 
vs. ideal sort on each of the 20 RSPC, result- 
ing in 80 r’s for the four subjects. Each sub- 
ject was also compared with each of the others 
in terms of their self sorts, on each of the 20 
RSPC, yielding 120 r’s in all. Finally, each 
subject was compared with each of the others 
in terms of their ideal sorts on each of the 20 
RSPC, furnishing another 120 r’s. 
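
The calculation of a single r may be sketched as follows. The two sorts generated here are artificial (random arrangements of the forced distribution) and serve only to show how the 50 pairs of scores enter the product-moment formula; they are not data from the experiment, and the function names are hypothetical.

```python
import random
from math import sqrt

PILE_SIZES = [2, 3, 6, 9, 10, 9, 6, 3, 2]  # piles scored from nine down to one


def forced_scores():
    """The 50 scores implied by the forced distribution, before assignment to items."""
    scores = []
    for weight, size in zip(range(9, 0, -1), PILE_SIZES):
        scores.extend([weight] * size)
    return scores


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Artificial example: position i holds the score one sort gave to item i.
rng = random.Random(0)
self_sort = forced_scores()
rng.shuffle(self_sort)
ideal_sort = forced_scores()
rng.shuffle(ideal_sort)

print(round(pearson_r(self_sort, ideal_sort), 2))
```

Because every sort uses the same forced distribution, the means and variances of the two score lists are identical, and r depends only on how the items are paired.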

In the first column in Table 1, under A-A, 
S-I, are listed the r’s obtained from Subject A 
on the 20 RSPC, reflecting the agreement be- 
tween his self concept and his ideal concept. 
The mean value and SD of the 20 r’s were 
calculated. The mean r’s and SDs were also 
determined for subjects B, C, and D. The six 
possible intercomparisons for self sort and 
ideal sort were determined in the same man- 
ner. The 16 parameters were obtained from 
the self sorts and ideal sorts on the UPC. It 
was the purpose of this experiment to deter- 
mine the degree of agreement between the 
statistical values on the RSPC and the para- 
metric values on the UPC. 
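
The count of comparisons can be verified by enumeration. The sketch below uses only the design described in the text (four subjects, 20 RSPC, one UPC); the variable names are illustrative.

```python
from itertools import combinations

subjects = ["A", "B", "C", "D"]

# One self-ideal comparison per subject, plus every pairing of subjects
# for the self sorts and again for the ideal sorts.
self_ideal = [(s, s, "S-I") for s in subjects]
self_self = [(a, b, "S-S") for a, b in combinations(subjects, 2)]
ideal_ideal = [(a, b, "I-I") for a, b in combinations(subjects, 2)]
per_set = self_ideal + self_self + ideal_ideal

n_rspc = 20
rspc_total = n_rspc * len(per_set)       # r's on the random sets
grand_total = rspc_total + len(per_set)  # plus the 16 r's on the UPC

print(len(per_set), rspc_total, grand_total)  # 16 320 336
```

The 16 comparisons per set correspond to the 16 columns of Table 1, and the total of 336 matches the number of intercorrelations reported.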

As a null hypothesis, it would be stated 
that no difference exists between the mean r’s 
obtained on the RSPC and the r’s obtained 
on the UPC. 


Results 


From Table 1 it can be seen that not one 
of the differences between the mean r’s obtained
on the RSPC and the r’s obtained on
the UPC is statistically significant. Even at a 
10% level of confidence, the null hypothesis 
that no difference exists would not be rejected. 
The degree of agreement may also be ex- 
pressed in terms of the correlation between 
the mean RSPC r’s and the UPC r’s, r = .94. 


Discussion 


This study is offered as a methodological 
contribution: the establishment of a universe 
of statements drawn up in systematic and in- 
clusive fashion, as a supply from which state- 
ments may be drawn for use in Q-sort studies. 
It has been demonstrated that statistical 
values for the random samples are in close 
agreement with the parametric values for the 
universe. The agreement held between persons 
for self sorts, between persons for ideal sorts, 
and within persons for self-ideal sorts. The 
experimenter is thus in position to determine 
with greater confidence the stability of Q-sort 
correlations for a particular individual. One 
need not necessarily rely on results from only 
one set of statements. Results from one set 
may be compared with results from any other 
set of statements. Agreement between results 
on the independently drawn RSPC is a basis 
for verification. If it is desired to see whether 
or not change has occurred (for example, 
change in a patient in psychotherapy), the 
same or another RSPC may be used. The use 
of two or more RSPC yields independent sets 
of measures to determine agreement as to ex- 
tent and direction of change, if any. Thus, 
increased confidence may be felt that such 
changes as are reflected are meaningful in 
nature. 

With regard to possible overlap of items in 
the 20 RSPC and the UPC, no actual count 
was made. It seems reasonable to assume that 
at least 600 items in the UPC were not in- 
cluded in any RSPC. 
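
A rough expectation for the overlap can be worked out under the assumption that the 20 sets were drawn independently and that every statement was equally likely to enter a given set. This is an illustrative calculation, not a count made in the study.

```python
UNIVERSE = 1575
SET_SIZE = 50
N_SETS = 20

# Chance that a given statement is absent from one set of 50 distinct statements.
p_missed_by_one_set = 1 - SET_SIZE / UNIVERSE

# Chance that it is absent from all 20 independently drawn sets.
p_missed_by_all = p_missed_by_one_set ** N_SETS

expected_unused = UNIVERSE * p_missed_by_all
print(round(expected_unused))  # about 826
```

On these assumptions roughly half of the UPC would be expected to appear in no RSPC, which is consistent with the estimate of at least 600 given above.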

It may be permissible to consider an anal- 
ogy in the field of physical measurement. If 
a random sample of adults were to be measured
for height and weight, the correlation
between the values would agree, within the 
limits of experimental error, with the corre- 
lations on other random samples and with the 
correlation on the entire supply from which 
the random samples were drawn. In executing
two sorts, such as self and ideal, on a given 
RSPC, a subject would assign pairs of values 
to a set of randomly drawn statements. It 
should be expected that the values he would 
assign on random sets of statements would 
correspond with values he would assign on the 
universe from which the unbiased samples
were drawn. 
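
The sampling argument can be illustrated with a small simulation. Everything in it is artificial: the "universe" is 1575 pairs of continuous scores generated with a built-in correlation near .70, not Q-sort data, and the seed and names are arbitrary. It shows only that r's from random samples of 50 scatter around the r for the whole universe.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def pearson_r(pairs):
    """Product-moment correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


rng = random.Random(22)

# Artificial universe: 1575 pairs with a built-in correlation of about .70.
population = []
for _ in range(1575):
    x = rng.gauss(0, 1)
    y = 0.7 * x + sqrt(1 - 0.7 ** 2) * rng.gauss(0, 1)
    population.append((x, y))

population_r = pearson_r(population)

# Twenty random samples of 50 pairs each, drawn without replacement within a sample.
sample_rs = [pearson_r(rng.sample(population, 50)) for _ in range(20)]

print(round(population_r, 2), round(mean(sample_rs), 2), round(pstdev(sample_rs), 2))
```

The spread of the 20 sample values plays the role of the SD row in Table 1, and the universe value plays the role of the UPC r.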


Summary 


Reliability of Q-sort correlations cannot be 
determined through alternate forms involving 
equated items. A new approach involves se- 
lecting groups of statements rather than 
matching items. A universe of 1575 Personal 
Concepts (UPC) was drawn, using every 
word in the dictionary suitable for formulat- 
ing human reaction at the sixth-grade level or 
below. Using random digits, Random Sets of 
Personal Concepts (RSPC) were drawn: 20 
sets of 50 items. A null hypothesis was pro- 
posed: self-ideal correlations on the 20 RSPC 
would not differ from the parametric values
on the UPC. Four graduate students per- 
formed the experiment. The hypothesis was 
not rejected (10% level). It was concluded 
that results on random sets agree with results 
on a universe which is clearly defined, easily 
understood, and consistent in frame of ref- 
erence. 


Received April 11, 1957. 


References 


1. Hilden, A. H. Manual for Q sort and random sets 
of personal concepts. Webster Groves 19, Mo. 
(628 Clark Ave.): Author, 1954. 

2. Hilden, A. H. Universe of personal concepts. Web- 
ster Groves 19, Mo. (628 Clark Ave.): Author, 
1954. 

3. Thorndike, E. L. Thorndike century senior dic- 
tionary. New York: Appleton-Century, 1941. 

4. Thorndike, E. L., & Lorge, I. The teacher's word 
book of 30,000 words. New York: Bur. of
Publ., Teachers Coll., Columbia Univer., 1944. 

5. U. S. Interstate Commerce Commission. Table of 
105,000 random decimal digits. Washington: 
Government Printing Office, 1949. 


Journal of Consulting Psychology
Vol. 22, No. 1, 1958


Manifest Anxiety in Prisoners Before and After CO₂


Dell Lebo, Robert A. Toal,¹


Richmond Professional Institute 


and Harry Brick 


Virginia State Penitentiary 


The Taylor Manifest Anxiety Scale (MAS) 
(18, 19) has figured prominently in many 
recent examinations of the phenomena of 
anxiety and learning. These studies gener- 
ally have yielded findings of significance and 
theoretical importance. Hence, the reliability 
and validity of the scale as a measure of mani- 
fest anxiety deserve examination. 

The reliability of several versions of the 
MAS, under various conditions, has ranged 
from .68 for anxiety scores from the complete 
Minnesota Multiphasic Personality Inventory 
(MMPI) and the Biographical Inventory 
(19) to .95 for the original 225-item form 
and a form consisting only of 50 critical items 
(14). A review of the literature (12) has re- 
ported a figure of .96 as the upper limit of 
this range. Most frequently, retest reliabilities 
are above .80, even after a lapse of seventeen 
months (10). Considering the diversity of 
forms marshalled under the aegis of the MAS 
with dissimilar buffer items (6, 9, 18, 19), 
changes in the number of critical items (9, 11, 
18, 19), substitutions in the use of particular 
critical items (5, 21), and modifications of 
the vocabulary and sentence structure of the 
critical items (4, 19), these figures are re- 
assuring. 

However, the picture is not so conclusive 
when the aspect of validity is examined. When 
the MAS was first constructed it was not 
validated against a criterion of manifest anxi- 
ety external to the scale itself. Instead, items 
were selected on the basis of substantial agreement
by five clinicians on 65 of 200 or so
MMPI items (18). Later, 50 items showing
the greatest relationship to total anxiety
scores became the node of the scale (17).

¹ Now at the Medical College of Virginia.

As Taylor has recently pointed out, “The 
construction of the test was not aimed at de- 
veloping a clinically useful test which would 
diagnose anxiety, but rather was designed 
solely to select Ss differing in general drive 
level” (20, p. 303). Hence, many studies em- 
ploying the MAS have not investigated anxi- 
ety per se, but rather the effect of drive on 
performance. A tidal wave of MAS studies 
verifying hypotheses from Hullian theory has 
appeared. In all of these studies anxiety has 
been defined in a thoroughly operational man- 
ner, i.e., in terms of MAS scores only. Such a 
procedure is sound methodologically but lim- 
ited clinically. The need for validation against 
external ratings of anxiety has been assuaged 
with a spate of articles. 

It has seemed to the present writers that an 
ideal way of demonstrating the validity of the 
MAS largely has been overlooked. This 
method is identical with that for validating 
a projective test. It consists of measuring, 
manipulating, and remeasuring an independ- 
ent variable, in this case anxiety, to determine 
whether observed differences in the magnitude 
of the variable correlate with measured differ- 
ences in MAS scale scores. 

Only one study seems to have approached 
this technique. Gallagher (9) hypothesized 
that students undergoing client-centered ther- 
apy would show a significant decrease in 
anxiety upon the termination of therapy. He 
used four measures of anxiety derived from 
the MMPI, including the MAS. All the meas- 
ures showed a significant decrease from before 
therapy to after. He made a point of distinguishing
between anxiety used as a theoretical
motivating factor, “present by inference rather 
than by direct observation,” and anxiety stress 
which is “a descriptive term for a syndrome 
of physiological and psychological tensions” 
(9, p. 443). 

Although Gallagher was concerned with 
anxiety stress, the measure he found most 
adequate, the MAS, has been used to demar- 
cate both types of anxiety. Hence, the present 
investigators would have appreciated some in- 
dication of physiological and psychological 
tension in the subjects of his study other than 
that obtained by MMPI items and the fact 
that the college students presented themselves 
for help in personal adjustment. What seems 
to be missing, from the point of view of a 
validity study, is the direct observation of 
anxiety stress. 

It seems germane to study anxiety in a 
more stressful situation than college life, in a 
situation where the alleviation of anxiety 
would be a major factor and not a “subgoal” 
(9, p. 444) of therapy. The present investi- 
gation used as subjects persons in a life situa- 
tion exposing them to severe loss, if not to 
danger or injury. 


Problem 


The purpose of the present study was to
validate the MAS directly by applying
it to anxious subjects in a stress situation.
The anxiety of these subjects was then
deliberately manipulated in that certain of
them underwent carbon dioxide therapy
(CO₂) to alleviate their anxiety, while others
were not treated therapeutically. It was hypothesized
that as those persons receiving
CO₂ had their directly observable anxiety
(Gallagher’s anxiety stress) reduced there
would be a concomitant reduction in MAS
score. Subjects not receiving CO₂ would not
undergo any reduction in MAS score. 


Method 
Subjects 


All Ss were selected from a penitentiary
with virtually all kinds of inmates whose sentences
ranged from a comparatively short period
to life imprisonment. Inmates appearing
in the daily referral to the prison psychiatrist 
were screened for suitability. 


Only persons psychiatrically diagnosed as suffering 
from an emotional disorder with predominant symp- 
toms of manifest anxiety were eligible to become Ss. 
The Ss selected were directly observed to be more 
than normally apprehensive. They presented a wide 
range of anxiety symptoms. Chief among these were: 
free-floating anxiety, periodic anxiety attacks, poor 
concentration, frightening dreams, sleeplessness, ano- 
rexia nervosa, fearfulness, irritability, general restlessness,
and tension headaches. One S, who suffered
from tension headaches and whose MAS score was 
comparatively low, asked an examiner to inspect his 
responses and tell him if they were those of a nor- 
mal man. Other Ss presented a more labile picture. 
Several Ss feared attacks of panic. One indicated this 
fear by saying, “I feel like my nerves are about to 
overtake me.” Another S requested treatment because 
he was in constant dread of being attacked by other 
prisoners. Although this man was well liked, his fears 
were aggravated by a murder which occurred in the 
prison. One S was lost to the study because of an 
anxiety attack while participating in the investiga- 
tion. While walking down a corridor smoking a 
cigarette he was seen by a guard. To call the prison- 
er’s attention to his infraction of a rule, the guard 
called out, “Hey you!” The prisoner began shaking, 
went into a severe anxiety attack, and immediately 
had to be hospitalized. In addition to such symp- 
tomatology, in nearly all cases the inmates’ prison, 
social, or work adjustment was seriously impaired 
as a result of their anxiety. 


The Ss were not, then, given the MAS and 
demarcated as anxious on the basis of their 
test scores. Instead, they were prison inmates 
whose anxiety already was diagnosed psychi- 
atrically as sufficiently serious to warrant 
therapeutic intervention. Twenty-four highly 
selected inmates were divided into treatment 
and control groups. There were 8 white Ss 
and 4 Negroes in the control group and 7 
whites and 5 Negroes in the experimental 
group. The Ss were randomly assigned to 
either the experimental or control group in 
nearly all cases. It was not practicable, for 
example, to assign a potentially violent pa- 
tient to the control group. Every effort was 
made to avoid assigning patients to the con- 
trol group merely on the basis of a less press- 
ing need for treatment. 

The groups were closely matched with re- 
gard to length of institutionalization, age, 
and IQ. Statistically, the means for the two 
samples arose by sampling from the same 
population. 




Procedure 


A pretest interview was conducted with 
each S before testing to assess the S’s reaction 
to the test situation with particular attention 
to their test anxiety. Ss who reacted in a hos- 
tile or suspicious manner were eliminated from 
the present investigation. The Ss receiving 
CO₂ were tested within the week prior to the
beginning of treatment and received their 
second testing session within the week after 
treatment was concluded. The time between 
the testings of the control group was identical. 

Both groups continued to participate fully 
in institutional living. Nothing was done to 
alter directly the stressful environment of 
either group. CO₂ was administered in a
matter of minutes and the inmate was re- 
turned to face the stresses of prison life. Thus, 
the sociopsychological effects of treatment 
were minimized. 


Psychological Tests 


The pre- and posttherapy tests consisted of 
the critical items of the MAS, as listed by 
Windle (21) with the exception of MMPI 
item 442, and the Bender-Gestalt test (B-G) 
(2). Because of suggestions that the MAS 
may be “especially susceptible to subject’s 
conscious or unconscious motivation to pre- 
sent a favorable picture of themselves” (6, 
p. 277), another measure, less susceptible to 
faking, seemed desirable. A posttherapy decrease
in MAS scores should reflect a more
efficient approach to reality. Pascal and Sut- 
tell (16) claimed that the B-G test is a bit of 
reality with which S has to cope. The result- 
ing performance, therefore, may be a measure 
of “an attitude toward reality” (16, p. 8). 
Basowitz et al. (1), supported this contention. 
They found that normal, anxiety-prone Ss in 
a stress situation, i.e., paratrooper training, 
gave less efficient performances to tachisto- 
scopically presented B-G designs than did a 
control group. Since it is claimed that “simu- 
lation is impossible” (2, p. 152) on the B-G, 
it seemed an ideal companion, especially when 
objectively scored, for the “susceptible” MAS. 


Carbon Dioxide 


Carbon dioxide seems to be an especially 
effective somatic therapy in the treatment of 


anxiety (3, 13, 15). The present investigation
employed Meduna’s (15) slow coma technique,
with a mixture of 30% CO₂ and 70%
O₂. At no time was pure CO₂ ever introduced
into the breathing bag. The number of treatments
received by the experimental group
ranged from 14 to 22 with a mean of 19.2.
Treatments were discontinued upon manifest
amelioration of anxiety, i.e., when the Ss’ presenting
complaints were no longer directly observable
and psychiatric diagnosis indicated that therapy
could be discontinued. The control group
received no treatment.

In his review of the literature on CO₂,
Frank (8) expounded two methods of control
in CO₂ experiments: (a) an untreated control
group and (b) a treated control group to
be given either an N₂O-O₂ mixture or O₂
“spiked” with CO₂. The second suggestion is
neither as logical nor as simple as it first appears,
for CO₂ is more frequently reacted to
as a noxious stimulus than is N₂O. A “placebo”
of N₂O could be easily distinguished
from a treatment of CO₂. Spiking with CO₂
may also be unsound, for the central nervous
system is highly selective in its takeup of CO₂.
Consequently, a much higher concentration of
CO₂ may exist in the blood than the small
amount used to spike the gas would suggest.
Such a concentration could not be regarded
as a placebo.

Because of such considerations, the control 
group was untreated but continued to live in 
the same stressful environment as the treat- 
ment group. 


Statistical Treatment 


The data of the present experiment, com- 
ing as they did from a sample selected for 
symptoms of manifest anxiety, could be ex- 
pected to be markedly skewed. The scoring 
of the B-G also distributed in a moderately 
skewed manner in the population on which 
the test was standardized (16). Hence, Fes- 
tinger’s (7) distribution-free method of anal- 
ysis was employed. 
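As a rough illustration of the rank-based reasoning behind a distribution-free comparison, the sketch below pools the scores of two groups, ranks them (tied scores share the average rank), and reports the mean rank of each group. It is not Festinger's (7) procedure itself, whose significance tables are not reproduced here. The change scores and the function names are hypothetical and serve only to show the ranking step.

```python
def average_ranks(values):
    """Rank values from 1..N; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mean_ranks(group_a, group_b):
    """Mean rank of each group after ranking the pooled scores."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) / n_a, sum(ranks[n_a:]) / len(group_b)


# Hypothetical posttest-minus-pretest changes (not data from the study).
treated = [-9, -7, -12, -3]
untreated = [1, 0, -2, 2]
treated_mean, untreated_mean = mean_ranks(treated, untreated)
print(treated_mean, untreated_mean)
```

A lower mean rank for the treated group indicates that its changes tend to fall below those of the untreated group; whether the gap is significant depends on the tabled values of the particular test used.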


Results and Discussion


Median scores for the experimental and
control groups on the MAS and the B-G are
shown in Table 1. The Festinger (7) rank
difference comparisons of the two groups are
listed in Table 2.


Table 1

Median Scores of Experimental and Control
Groups on the MAS and the B-G

          Experimental Group       Control Group
Test      Pretest    Posttest    Pretest    Posttest

MAS        28.5       19.0        22.0       22.5
B-G        30.5       22.0        20.5       33.0


As can be seen from Table 1, both the MAS 
and B-G scores for the experimental group decreased
following CO₂. MAS values for the
control group remained approximately the 
same, suggesting the environment they orig- 
inally found to be stressful had remained 
relatively unchanged. The B-G scores of this 
group, however, markedly increased. This dif- 
ference in scores is interesting, but because of 
the limited number of Ss the present writers 
will not indulge in public post hoc speculation. 

Table 2 reveals that the MAS scores de- 
creased in a statistically significant manner 
for the group whose anxiety was treated by 
CO₂. There was no such decrease for the untreated
anxious group. These people, presumably,
continued to operate in a stressful
milieu. The hypothesized reduction in MAS 
score was realized. 
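The mean ranks in Table 2 can be checked for internal consistency. When N scores are ranked from 1 to N, the mean of all ranks is (N + 1)/2, so the size-weighted average of the two group mean ranks must equal that value. The short check below assumes two groups of 12, as described under Subjects; the function name is illustrative.

```python
def overall_mean_rank(exp_rank, ctl_rank, n_exp=12, n_ctl=12):
    """Size-weighted average of two group mean ranks."""
    return (n_exp * exp_rank + n_ctl * ctl_rank) / (n_exp + n_ctl)


expected = (12 + 12 + 1) / 2  # mean of the ranks 1..24

# Mean ranks as reported in Table 2 (experimental, control).
for test, (exp_rank, ctl_rank) in {"MAS": (9.08, 15.92),
                                   "B-G": (9.42, 15.58)}.items():
    print(test, round(overall_mean_rank(exp_rank, ctl_rank), 2), expected)
```

Both rows agree with the expected value of 12.5, which is what a correctly ranked pool of 24 scores requires.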

The MAS may be said to reflect differences 
in the magnitude of the variable of manifest 
anxiety. Hence, it was found to be a valid 
measure of manifest anxiety. 

It should be remembered that the B-G test 
was included to check any tendency on the 


Table 2

Festinger Comparisons of Experimental and Control
Groups on the MAS and the B-G

                     Mean Rank
Test      Experimental Group    Control Group

MAS              9.08               15.92
B-G              9.42               15.58

* Significant at the .05 level.


part of the experimental group to fake MAS 
responses. The improvement of the treatment 
group was significant also in their B-G scores.
As MAS scores decreased B-G performance 
improved. This improvement in coping with 
the reality of the B-G test suggested that the 
MAS was not faked by the experimental 
group. 

An association implicit in the present study 
is that the observation of manifest anxiety has 
a relationship to MAS scores. Several studies 
specifically relating MAS scores and observa- 
tional data have appeared. In none of these 
studies has the relationship between observed 
anxiety, observed diminution of anxiety, and 
MAS scores been investigated. Certainly the 
high relevance of this relationship, quantified 
and made explicit, earmarks it as a desidera- 
tum for future research. 


Summary 


Distribution-free comparisons of a group of 
24 manifestly anxious prisoners were made. 
Half of the men received CO₂ and half were
given no treatment. Reasons for the absence 
of a placebo control were advanced. A statis- 
tically significant improvement in the per- 
formance of the experimental group on the 
MAS was obtained. This improvement was 
also seen in a check test, the B-G. The results 
were interpreted as indicating the validity of 
the Taylor MAS as a measure of manifest 
anxiety. 


Received March 18, 1957. 


References 


1. Basowitz, H., Persky, H., Korchin, S. J., &
   Grinker, R. R. Anxiety and stress. New York:
   McGraw-Hill, 1955.

2. Bender, Lauretta. A visual motor gestalt test and
   its clinical use. New York: Amer. Orthopsychiat.
   Ass., 1953.

3. Brick, H. Carbon dioxide therapy in private
   practice and in the Virginia State Penitentiary.
   Tri-State Med. J., 1956, 3, 14-17.

4. Buss, A. H. A follow-up item analysis of the
   Taylor anxiety scale. J. clin. Psychol., 1955,
   11, 409-410.

5. Cook, W. E. A study of anxiety in the Minnesota
   Multiphasic Personality Inventory. Va. J.
   Sci., 1953, 4, 28.

6. Davids, A. Relations among several objective
   measures of anxiety under different conditions
   of motivation. J. consult. Psychol., 1955,
   19, 275-279.

7. Festinger, L. The significance of the difference
   between means without reference to the frequency
   distribution function. Psychometrika,
   1946, 11, 97-105.

8. Frank, J. A. A critical evaluation of carbon dioxide
   oxygen inhalation therapy in mental disorders.
   Amer. J. Psychiat., 1953, 110, 93-103.

9. Gallagher, J. J. Manifest anxiety changes concomitant
   with client-centered therapy. J. consult.
   Psychol., 1953, 17, 443-446.

10. Hedlund, J. L., Farber, I. E., & Bechtoldt, H. P.
    Normative characteristics of the Manifest
    Anxiety Scale. Paper read at Midwest. Psychol.
    Ass., Chicago, April, 1951.

11. Holtzmann, W. H., Calvin, A. D., & Bitterman,
    M. E. New evidence for the validity of Taylor’s
    Manifest Anxiety Scale. J. abnorm. soc.
    Psychol., 1952, 47, 853-854.

12. Kendall, E. The validity of Taylor’s Manifest
    Anxiety Scale. J. consult. Psychol., 1954, 18,
    429-432.

13. LaVerne, A. A., & Herman, M. An evaluation of
    carbon dioxide therapy. Amer. J. Psychiat.,
    1955, 112, 107-113.

14. Matarazzo, J. D., Guze, S. B., & Matarazzo, Ruth
    G. An approach to the validity of the Taylor
    anxiety scale: scores of medical and psychiatric
    patients. J. abnorm. soc. Psychol., 1955, 51,
    276-280.

15. Meduna, L. J. Carbon dioxide therapy. Springfield,
    Ill.: Thomas, 1950.

16. Pascal, G. R., & Suttell, Barbara J. The Bender-Gestalt
    test. New York: Grune & Stratton,
    1951.

17. Spence, K. W., & Taylor, Janet A. Anxiety and
    strength of the UCS as determiners of the
    amount of eyelid conditioning. J. exp. Psychol.,
    1951, 42, 183-188.

18. Taylor, Janet A. The relationship of anxiety to
    the conditioned eyelid response. J. exp. Psychol.,
    1951, 41, 81-92.

19. Taylor, Janet A. A personality scale of manifest
    anxiety. J. abnorm. soc. Psychol., 1953, 48,
    285-290.

20. Taylor, Janet A. Drive theory and manifest
    anxiety. Psychol. Bull., 1956, 53, 303-320.

21. Windle, C. The relationships among five MMPI
    “anxiety” indices. J. consult. Psychol., 1955,
    19, 61-63.




Journal of Consulting Psychology
Vol. 22, No. 1, 1958


Bender-Gestalt Test Correlates of 
Emotional Depression¹


John E. Tucker and Mimi J. Spielberg 
Mental Hygiene Clinic, VA Hospital, Albany, N. Y. 


The purpose of this study was to discrimi- 
nate between depressed and nondepressed 
clinical patients by means of the Bender- 
Gestalt given in diagnostic test batteries. 

Thirty-six male Ss were divided into two 
groups, depressed and nondepressed. The dif- 
ferences in age and intelligence between the 
two groups were not statistically significant. 
The sample included diagnoses of psychosis, 
psychoneurosis, and character disorder, and 
excluded diagnoses of organic brain damage. 

Trained raters determined the presence or 
absence of depression by a standardized 
evaluation of the Ss’ VA clinical records and 
psychological test materials, excluding the 
Bender. Acceptable agreement between raters 
was demonstrated in the differentiation of the 
two groups. A 27-item check list was devised 
for use as a guide and summary in judging 
depression. 

The Bender scoring system used was that 
of Pascal and Suttell, a 107-item check list. 
The identity of the patient was concealed. 

Comparisons were made between the de- 
pressed and nondepressed Ss on 20 separate 
Bender test item scores. Not all the items se- 
lected for comparison were equally relevant 
clinically. Of the 20, only two items, tremor 


¹ An extended report of this study may be ob-
tained without charge from John E. Tucker, Mental 
Hygiene Clinic, VA Hospital, Albany, N. Y., or for 
a fee from the American Documentation Institute. 
Order Document No. 5430, remitting $1.25 for micro- 
film or $1.25 for photocopies. 




and distortion, were found to be significant at 
the .05 level. No items were significant at the 
.01 level.

It can be concluded that Ss sometimes ex- 
press depression through a tremor on the 
Bender-Gestalt. Tremor alone is not sufficient
to diagnose depression, but it perhaps can be
used to warn a clinician of that possibility.
Tremor was found primarily among the psy- 
choneurotic Ss. 

The greater distortion was found in the de- 
pressed group, and the Ss who distorted the 
designs were primarily psychoneurotics. No 
distortion was found in the nondepressed 
group. This finding seems to contradict the 
hypothesis of pedantic accuracy usually asso- 
ciated with depressed behavior. Since this re- 
sult does not appear to be clinically meaning- 
ful, it may be an artifact of the sample. Other 
items usually thought to be related to depres- 
sion, such as compression of drawing space, 
were nondiscriminating. The Bender test, as 
scored by Pascal’s system, does not seem to 
offer clinical usefulness in detecting emotional 
depression in typical cases in our clinic. 

Comparisons were made of average initial 
reaction time and average total response time 
to each card. It is interesting to note that the 
tendencies were not in the expected direction. 
The nondepressed Ss were slower in the 
Bender response times than the depressed pa- 
tients. 

Brief Report. 
Received October 25, 1957. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958


Assaultiveness and Two Types of Rorschach 
Color Responses¹


Robert Sommer and Dorothy Twente Sommer 
The Saskatchewan Hospital, Weyburn 


Much of the thinking about the relation- 
ship between Rorschach responses and other 
behaviors is based on the concept that the 
formal aspects of the responses (determi- 
nants) should have correlates in other be- 
haviors of the person (introversion, anxiety, 
etc.). However, present-day thinking tends to 
regard information as to structural aspects of 
personality alone (e.g., defense mechanisms) 
as insufficient for understanding the behavior 
of the individual. Past thinking was consid- 
ered unsatisfactory in that it looked for rela- 
tionships between energy, affects, or need and 
overt behavior without taking into account the 
channels the organism had developed for han- 
dling this energy. It was also felt that infor- 
mation as to these structures alone left one 
with a bare shell and still unable to predict 
the actual behavior of the person. What was 
needed was information regarding both the 
contents of fantasy and the person’s tech- 
niques for handling his affects. 

At least two approaches to Rorschach re- 
search are unsatisfactory: (a) to relate de- 
terminant scores to behavior without regard 
to the content of the responses;² (b) to relate
content categories (e.g., destructive, oral
expulsive, etc.) to behavior without regard to 


¹ The study was supported by a grant from the
State of Louisiana hospital research fund. Without 
the active cooperation of the staffs of Gulfport VA 
Hospital (especially Herdis A. Deabler) and of South- 
east Louisiana Hospital (especially Joseph G. Daw- 
son), the study could not have been undertaken. 

² With some determinants there is a confounding
of content and structure, as in M where the use of a 
human figure is usually required. Since this is a com- 
mon but not necessary condition for M, it makes the 
meaning of any correlation between M and other 
factors quite ambiguous. 


the formal aspects of the response (determi- 
nant score). 

An attempt is made in the present study to 
base predictions on a combination of formal 
and content elements. The type of behavior 
selected for study is assaultive or explosive 
behavior where the Rorschach literature is 
both considerable (3, 4, 6, 7) and often con- 
tradictory. There have been several attempts 
to design experiments that would assess the 
merits of the hypotheses proposed, but the re- 
sults are not easy to understand. With regard 
to structural correlates of explosive behavior, 
there are those who believe that either very 
many color responses or no color responses are 
indicative of it. With regard to the content of 
the responses, predictions can be both very 
specific (“tomahawk” to the white space on 
Card VII indicates aggressive acting out) or 
very general (many animal responses indicate
stereotyped thinking). In the prediction
of aggressive behavior from the Rorschach, 
there are two distinct groups of clinicians: 
(a) Some believe aggressive contents indicate 
that the person is able to express his hostile 
impulses in a socially acceptable way and 
hence has no need to act out. Phillips and 
Smith (5), for example, speak of blood re- 
sponses as contraindicative of destructive act- 
ing out. (b) Others believe that the use of 
hostile content indicates a preoccupation with 
such thoughts. Depending upon the individu- 
al’s impulse-control system, such behavior 
should be expected if the sample of the in- 
dividual’s life-history is adequate. Few tests 
have been based on the first approach except 
that of Stone (7) which met with essentially 
negative results. Those of Pitluck (6) and 
Finney (1) may be taken to support the lat- 




ter view, although in the former case the re- 
sults were not clear-cut. 

The present problem is to determine the 
relationship between explosive behavior and 
two types of Rorschach color responses: (a) 
aggressive and explosive responses (volcano, 
fire, blood from a wound, etc.); (b) nonaggressive
responses (bouquet, ice-cream, orchid,
etc.). One essential condition is that the two 
groups should not differ in Sum C; the only 
difference should be in the content of the color 
responses. Our prediction, based on the work 
of Stone, Pitluck, and Finney, is that the “ag- 
gressive content” group will show more ex- 
plosive and assaultive behavior than the “non- 
aggressive content” group. It was also decided 
to distinguish between physical explosiveness 
and assaultiveness (defined as explosive physi- 
cal attacks on persons, objects, or animals in 
the environment) and verbal explosiveness and 
assaultiveness (explosive episodes where tor- 
rents of abuse are directed toward the outside 
world). Both definitions stipulate that the be- 
haviors be characterized by an explosive qual- 
ity in addition to being assaults upon the en- 
vironment. 

An ancillary hypothesis was that “aggres- 
sive movement responses” (e.g., fighting, kick- 
ing, etc.), when accompanying aggressive color 
responses, would improve the prediction of ag- 
gressive acting-out behavior. However, the 
chief emphasis of the study was on the dif- 
ferences in behavior between Ss who gave 
“aggressive color responses” and those who 
gave “nonaggressive color responses.” 


Procedure 


Our preference for criteria of assaultiveness 
was for meaningful acts by the individual in 
his own particular world. This type of ma- 
terial is most easily found by interviewing or 
in good case histories if they happen to be 
available. Yet if these are gathered for pur- 
poses other than for one’s experiment (e.g., 
hospital records), one must always be aware 
that the interviewer may not have been interested
in the behavior that one is studying. If
the behavior studied is of considerable impor- 
tance in the individual's relations with others, 
as is the case with assaultive behavior, the 
chances of the interviewer’s overlooking it are 


minimal, although the possibility must still 
be considered. 

Hence, our criteria of assaultiveness are in- 
cidents of such behavior in case histories 
gathered routinely by the social service de- 
partment of a VA hospital. The procedure 
limited our sample to male psychiatric pa- 
tients which is regrettable in terms of gen- 
eralizing from the results but advantageous 
if aggressive acting-out is more prevalent in 
such a group. 

Essentially, the method embodied an ex 
post facto design in which two investigators 
worked independently. The senior author 
spent several weeks going through the files of 
the psychology department of this hospital 
and looked over approximately 200 protocols 
(none of which he had seen before). He wrote 
down the names of those patients who had 
given explosive color responses (objectively 
defined as responses of volcanoes, explosions, 
fire, etc.) and those who had given exclusively 
nonaggressive color responses (orchids, bouquets,
sherbet, etc.) and also whether
the M responses could be classified as aggressive
(objectively defined as actions of
fighting, kicking, etc., in a way similar to Finney’s
category of “active destruction” in his
Palo Alto scale). If a person with aggressive 
color responses also had several nonaggres- 
sive color responses, he was still included in 
the aggressive color category. If a subject had 
even six or seven nonaggressive color responses 
and one that was partially aggressive, he was 
not included in the nonaggressive color group. 

The list of approximately 65 names, ar- 
ranged in alphabetical order and with all other 
information removed, was given to the second 
investigator who was instructed to inspect the 
case history of the patient and rate the pre- 
hospital behavior along a seven-point scale for 
(a) physical assaultiveness and (b) verbal
assaultiveness. If there was insufficient infor- 
mation for rating along a dimension, a zero 
could be used. Some attrition was expected in 
our sample as a few case histories had been 
sent to other installations or could not other- 
wise be located. 


Results 


The final data consisted of the ratings of 
the case histories of 26 aggressive color Ss 


Table 1
Ratings on Explosiveness and Assaultiveness

Measure and Group                        N*     Mean               t      p

A. Physical explosiveness
     Aggressive color Ss                 22     3.41     3.14
     Nonaggressive color Ss              26     2.54     2.93

B. Verbal explosiveness
     Aggressive color Ss                 23     3.78     2.45
     Nonaggressive color Ss              28     3.07     2.22

C. Physical and verbal explosiveness
     Aggressive color Ss                 22     7.14    10.79    1.67    .05
     Nonaggressive color Ss              25     5.52    10.33

* The case histories of some Ss did not contain sufficient information for rating along all dimensions. Hence, the figures in
the cells may vary slightly depending on the dimension compared.



and 31 nonaggressive color Ss. The ratings on 
both physical and verbal assaultiveness ranged 
over the full scale (from 1, very low, to 7, 
very high). The mean rating of physical as- 
saultiveness for the total population was 2.98 
while for verbal assaultiveness it was 3.39. 
The mean Sum C for the aggressive color 
group was 3.83, while for the nonaggressive 
color group it was 3.60. Neither of these differences
is significant.

The data which test the hypothesis that ag- 
gressive color Ss exceed the nonaggressive 
color Ss in ratings on assaultiveness are pre- 
sented in Table 1. 
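The pooled comparison in Table 1 can be recomputed from its summary figures. The sketch below assumes that the second figure given for each group (10.79 and 10.33) is a variance computed with N in the denominator; this is an inference from the numbers, not something the table states. Under that assumption the standard error of the difference uses N - 1 in each term. The function name is illustrative.

```python
import math


def t_from_summary(mean_1, var_1, n_1, mean_2, var_2, n_2):
    """t for two independent means, given variances computed with N
    (not N - 1) in the denominator; this is an assumed reading of Table 1."""
    standard_error = math.sqrt(var_1 / (n_1 - 1) + var_2 / (n_2 - 1))
    return (mean_1 - mean_2) / standard_error


# Row C of Table 1: physical and verbal explosiveness pooled.
t_pooled = t_from_summary(7.14, 10.79, 22, 5.52, 10.33, 25)
print(round(t_pooled, 2))  # 1.67, matching the t reported for row C
```

That the recomputed value agrees with the tabled t of 1.67 lends some support to the assumed reading, though it does not establish it.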

For physical assaultiveness, the aggressive 
color Ss significantly exceed the nonaggres- 
sive color Ss. The results on verbal assaultive- 
ness are suggestive but fall slightly short of 
significance. If the ratings for the two types of 
assaultiveness are pooled to form a rating on 
general assaultiveness, the difference between 
the groups is significant at the .05 level. The 


Table 2
Percentage of Cases Predicted

                             Rorschach Rating
Assaultiveness         Aggressive     Nonaggressive
Rating                 Color Ss       Color Ss

Above median              59              40
Below median              41              60



biserial correlation between the type of color 
response given by the S and the pooled rat- 
ings of aggressiveness is .35 (p < .01). 
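For readers unfamiliar with the statistic, the sketch below shows one common way a biserial correlation between a two-category variable and a continuous rating can be obtained: the point-biserial coefficient is computed first and then rescaled by the normal ordinate at the point dividing the two categories. The ratings are hypothetical, the function names are illustrative, and the source does not state which computing formula was actually used.

```python
import math
from statistics import NormalDist, mean, pstdev


def point_biserial(groups, scores):
    """Point-biserial r between 0/1 group labels and continuous scores."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)
    return (mean(ones) - mean(zeros)) / pstdev(scores) * math.sqrt(p * (1 - p))


def biserial(groups, scores):
    """Biserial r, assuming the dichotomy splits an underlying normal variable."""
    p = sum(groups) / len(groups)
    # Ordinate of the standard normal curve at the point cutting off proportion p.
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(p))
    return point_biserial(groups, scores) * math.sqrt(p * (1 - p)) / ordinate


# Hypothetical data: 1 = aggressive color S, 0 = nonaggressive color S.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
ratings = [5, 4, 6, 3, 2, 3, 1, 4]
r_pb = point_biserial(groups, ratings)
r_b = biserial(groups, ratings)
print(round(r_pb, 3), round(r_b, 3))
```

The biserial value is always larger in absolute size than the point-biserial value computed from the same data, so the two should not be compared directly.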

Table 2 presents the percentage of correct 
and incorrect predictions when the assaultive- 
ness ratings are divided at the median. It 
shows that although the results are in the ex- 
pected direction, it would be hazardous to de- 
pend on this variable alone in making pre- 
dictions about the potentiality for assaultive 
behavior in individual cases. 

Some interesting results appear when the 
protocols are divided on the basis of Ss who 
gave aggressive, nonaggressive, and no move- 
ment responses. An overall rating of the move- 
ment responses had been made by the senior 
author at the time of classification of color 
responses. If there was doubt as to whether 
the tone was aggressive or nonaggressive, the 
cases were placed in a separate category and 
not included in the comparisons in Table 3. 

Although the number of cases is necessarily 
smaller than in the preceding table, the dif- 
ferences are even more marked when based 
on type of C response and type of M response. 
In terms of the predictive value of a combina- 
tion of movements to color responses, the 
biserial r for a comparison of the “aggressive 
color and aggressive or no movement Ss” with 
the “nonaggressive color and nonaggressive or 
no movement Ss” for their hostility ratings is 
.55 (p < .01). There are not enough cases
where aggressive color Ss gave clearly aggres- 
sive M responses to use either of these cate- 
gories in the comparisons. 




Table 3
Ratings on Explosiveness and Assaultiveness with Ss Classified by Type of C and M Responses

Measure and Group                                N     Mean     t     p

A. Physical explosiveness
     Aggressive C + Aggressive M                       4.00
     Nonaggressive C + Nonaggressive M                 2.67

     Aggressive C with no M                            4.00
     Nonaggressive C + Nonaggressive M                 2.67

     Aggressive C + Aggressive M                       4.00
     Nonaggressive C with no M                         1.75

     Aggressive C with no M                            4.00
     Nonaggressive C with no M                         1.75

B. Verbal explosiveness
     Aggressive C + Aggressive M                       4.13
     Nonaggressive C + Nonaggressive M                 3.31

     Aggressive C with no M                            4.17
     Nonaggressive C + Nonaggressive M                 3.31

     Aggressive C + Aggressive M                       4.13
     Nonaggressive C with no M                         2.22

     Aggressive C with no M                            4.17
     Nonaggressive C with no M                         2.22



Although the two groups of Ss had approxi- 
mately the same Sum C and M totals, the 
Sum C of the aggressive color Ss contained 
more pure C responses than that of the non- 
aggressive color group, which in turn ex- 
ceeded the former group in number of FC 
responses. The CF responses were approxi- 
mately the same for the two groups. This re- 
sult was hardly unexpected as most of the ag- 
gressive color responses could be scored C or 
CF, while “flowers” or “grasshoppers” gener- 
ally rate scores of FC or at most, CF. It 
seemed in order to see if a division by deter- 
minant alone (FC versus CF +C) would 
prove to be of more predictive utility with 
the ratings on assaultiveness than the use of 
content regardless of form-color balance. Fin- 
ney found that his groups of aggressive and 
nonaggressive Ss differed with regard to their 
CF + C:FC ratios. 

Of our sample, 39 Ss had CF + C exceed- 
ing FC, while 11 had FC equalling or greater 
than CF + C. Analysis of these data showed 
that there was not a significant difference be- 


tween the two groups on either physical or 
verbal assaultiveness. 


Discussion 

There were two theories about aggressive 
Rorschach color responses outlined at the be- 
ginning of this paper. The first was that such 
responses constituted a draining off of hostile 
impulses and we should find less acting-out in 
such Ss. This would postulate a concept of a 
level of unreality where the person can solve 
his problems without fear of retribution from 
a hostile environment. If he is able to do this, 
he will have no need to act-out in a socially 
unacceptable fashion. There is an implicit as- 
sumption here of a specific amount of psychic 
energy that the person can release either on a 
fantasy level or on a behavioral level or on 
both. 

Two groups of empirical studies support 
this view. The first would be instances where 
tension released in overt behavior reduces the 
tension in the fantasy level. A case in point 



2.46 05 

144 NS. 




would be the Zeigarnik effect experiments 
where the act of completing certain tasks will 
reduce the psychic tension as measured by 
memorial effects. There are also the studies 
carried out at Clark University where physi- 
cal restraint increased the number of M re- 
sponses given on the Rorschach. 

Conversely, there are those who believe that 
the use of a content area indicates a concern 
or even preoccupation with a given topic. 
Hence, we should certainly expect behavioral 
correlates if we look hard enough. This view 
implies that the Rorschach gives a sample of 
behavior in a particular situation (involving 
an interpersonal relationship, a cognitive task, 
and a subjective experience on the patient's 
part of being evaluated). We should logically 
expect to find other behavioral correlates of 
Rorschach behavior. Loose or uncontrolled 
responses to (hypothesized) affect-arousing 
stimuli should parallel analogous behaviors 
in similar situations. The patient who “ex- 
plodes” during the Rorschach (or phrased less 
strongly, the patient who loses the ability to 
control his response to color) would be the 
person we might expect to “explode” in other 
situations. 

Our results support this second view. Ag- 
gressive responses on the Rorschach are not 
negatively correlated with aggressive acting-out;
they tend to be positively correlated with
it. This has implications both for making predictions
of such behavior from the Rorschach
and for our understanding of the relationship
between fantasy³ and behavior. The “catharsis”
view of test responses is not uncommon
among clinicians, especially in the matter of 
blatant sex responses. The idea is often pro- 
posed that the person who reports several 
penises and vaginas will be a mature person 
who is capable of expressing his impulses in
an appropriate manner and hence less likely
to indulge in antisocial sexual behavior. Our
hypothesis would be that this is an incorrect 
supposition. Research should show that indi-
viduals committing rape or indulging in self-
destructive promiscuous behaviors will pro-
duce significantly more blatant sex responses
than patients hospitalized for other reasons.


8 “Fantasy” as used here refers to an unknowable
construct, literally “the person’s inner world,” the
contents of which can only be inferred from verbali-
zations, parapraxes, etc.


The hypothesis should hold even for veiled or 
suggestive test responses. One can recall the 
Levine, Chein, and Murphy (2) study in 
which responses of objects instrumental in 
eating (in addition to actual food responses) 
increased with food deprivation. Our hypothe-
sis would be that only in rare instances,
such as the autism of the schizophrenic, can
fantasy serve as a contraindication of analo-
gous overt behavior. Even the schizophrenic
girl preoccupied with sexual fantasies will dis-
play seductive and exhibitionistic behaviors
on many occasions. It is our view that evi-
dence for the “catharsis theory of test re-
sponses” is based largely on inadequate sam-
ples of the person’s behavior. It may be true
that the girl with very strong sex fantasies
will act prim and prudish when we interview
her. Yet if we examine her behavior over the
past year, or five years, or when she “let herself”
have a few drinks, we will find marked
resemblances between fantasy life and other
behaviors. Fortunately, in using case histories, 
we were able to view a sizable portion of the 
individual’s life. In studies involving artificial 
stimulus situations or small isolated samples 
of behavior, we should not expect to find high 
correlations between fantasy contents and be- 
havior. 


Summary 


The present study aimed at assessing the 
relationship between assaultive behavior and 
two types of color responses, aggressive and 
nonaggressive. The prediction was made that 
Ss giving aggressive color responses should 
show more assaultive behavior than the non- 
aggressive color Ss. 

One investigator looked through approxi- 
mately 200 Rorschach protocols of male pa- 
tients in a VA hospital and listed the names 
of Ss giving aggressive color responses and 
those giving nonaggressive color responses. 
The second investigator was given the list 
(without any information as to the category 
of the S) and was asked to rate the case his-
tory of the S as to incidents of verbal and
physical assaultiveness. The hypothesis was 
confirmed. The trends were especially clear in 
cases where S had given both aggressive color 
and aggressive movement responses. 


62 Robert Sommer and Dorothy T. Sommer 


Although the results were of theoretical 
relevance in assessing the relationship be- 
tween fantasy and overt behavior, the correla- 
tions were not of sufficient magnitude to per- 
mit their use in individual prediction of as- 
saultive behavior. 


Received March 25, 1957. 


References 


1. Finney, B. Rorschach test correlates of assaultive
behavior. J. proj. Tech., 1955, 19, 6-17.

2. Levine, R., Chein, I., & Murphy, G. The relation
of the intensity of a need to the amount of
perceptual distortion: A preliminary report.
J. Psychol., 1942, 13, 283-293.

3. Lindner, R. M. The Rorschach test and the diag-
nosis of psychopathic personality. J. crim.
Psychopath., 1943, 5, 69-93.

4. Lubar, G. H. Rorschach content analysis. J. clin.
Psychopath., 1948, 9, 146-152.

5. Phillips, L., & Smith, J. G. Rorschach interpreta-
tion: Advanced technique. New York: Grune
& Stratton, 1953.

6. Pitluck, Patricia. The relation between aggressive
phantasy and overt behavior. Unpublished doc-
toral dissertation, Yale Univer., 1950.

7. Stone, H. The relationship of hostile-aggressive be-
havior to aggressive content on the Rorschach
and Thematic Apperception Test. Unpublished
doctoral dissertation, U.C.L.A., 1953.


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958


A Normative Note on Sentence Completion 
Cross-Sex Identification Responses 


Francis W. King 
Dartmouth College 


The interpretation of projective responses 
may depend heavily for years upon such real, 
if elusive, factors as clinical experience and 
judgment. Quantitative data, however, can 
make more explicit and more communicable 
some aspects of clinical experience; further-
more, they can correct individual bias in inter-
pretation.

Normative data establish a more objective 
frame of reference and should serve in some 
instances to taper the more extravagant in- 
ferences. The meaning of cross-sex identifica- 
tion responses is a case in point. Eron (1) has 
reported that 30.7% of the nonhospitalized 
males in his normative TAT study immedi- 
ately misidentified the sex of the figure in 
Card 3BM; an additional 29.4% evidenced 
uncertainty as to the figure’s sex, with a 
subsequent 18.7% misidentifying and 10.7% 
correctly identifying. On the other hand, 
Mainord (3) found that 94.7% of her male 
subjects in a figure-drawing study drew the 
self-sex (male figure) first. With these nor-
mative differences, it would seem that the
clinician who makes comparable interpreta- 
tions from such cross-sex identifications does 
so at his peril. 

In view of the near absence of normative 
data on sentence completion responses, the 
writer undertook the following investigation. 
A sentence completion form was administered 
to an incoming freshman class prior to the 
beginning of classes. Using a sample of 212 
male students, who subsequently took the 
elementary psychology course, the writer 
analyzed the results of sentence stems that 
yielded data on identification. The following 
three items produced cross-sex identification 
responses. 


63 


23. When he found the trunk with all the clothes 
in it, just for fun he dressed up as —— 
39. In the college play he took the part of —— 


43. He hoped that some day he would be just 
like —— 
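The category counts that Table 1 reports for these three stems (N = 212) can be converted to percentages for comparison with the figures quoted in the text. The following is a minimal sketch only: the dictionary keys are shortened labels chosen for illustration, and the assignment of the Item 23 and Item 39 counts to categories assumes the printed N column is read in row order (each item then sums to 212).

```python
# Category counts as read from Table 1 (N = 212 college males).
# Keys are shortened labels chosen for this illustration.
N = 212

TABLE_1 = {
    "Item 23": {"Male": 143, "Female": 43, "Animal": 4, "Child": 2, "Residual": 20},
    "Item 39": {"Male": 177, "Female": 6, "Animal": 4, "Child": 1, "Residual": 24},
    "Item 43": {
        "Father": 112,
        "Female": 2,
        "Other male relative": 20,
        "Identifiable male figure": 16,
        "Ideal or hero": 11,
        "Occupational": 9,
        "Residual male figure": 27,
        "Individual uniqueness": 8,
        "Residual miscellaneous": 7,
    },
}


def percentages(counts, n=N):
    """Convert category counts to percentages of n, rounded to one decimal."""
    if sum(counts.values()) != n:
        raise ValueError("category counts do not sum to the sample size")
    return {label: round(100 * count / n, 1) for label, count in counts.items()}


# Percentage of cross-sex (female) identifications per sentence stem.
FEMALE_PCT = {item: percentages(counts)["Female"] for item, counts in TABLE_1.items()}

for item, pct in FEMALE_PCT.items():
    print(f"{item}: {pct}% female identifications")
```

On these counts the female-identification rate is about 20% for Item 23 and about 1% for Item 43, in line with the percentages cited in the discussion of Table 1.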


The results in Table 1 indicate that cross- 
sex identification responses are to some extent 
a function of the stimulus properties of the 
sentence stems, or, if one prefers, a function 
of the differing levels of personality tapped 
by such stems. Thus, Item 43, a rather “ob- 
vious” item, produces cross-sex identifications 
in only 1% of the population studied; in 
most cases it evokes a variety of fairly “pub- 
lic” expressions of ego ideal or desires readily 
accessible to awareness. Item 23, in contrast, 
produces a rather large number (20%) of 
cross-sex identifications, as well as some prim- 
itive and regressive responses revealing iden- 
tification with animals and children. Different 
sentence stems evoke responses from differing 
levels of personality organization; this clinical 
observation is borne out by the research of 
Hanfmann and Getzels (2). The whole area 
is one which should be explored systematically 
and thoroughly if the greatest usefulness is 
to be derived from this easily administered 
and productive technique. Of the 43 subjects 
who responded with feminine identifications 
on Item 23, only three gave female responses 
to Item 39 and only one to Item 43; an addi- 
tional five gave “residual” responses in Item 
39. This residual category contains a miscel- 
lany of responses that are nonspecific with 
respect to sex; some of these certainly reflect 
other than healthy processes, e.g., “death,” 
“a nobody,” “dope.” No subject responded 
with feminine identifications in both Item 39 
and Item 43. It might be mentioned in pass-
ing that two subjects replied with animal
identifications in both Items 39 and 43.


Table 1

Summary of Responses by Categories in Sentence
Stems Producing Cross-Sex Identifications
(N = 212 college males)

Identification Response                          N

Item 23
  Male identification                          143
  Female identification                         43
  Animal identification                          4
  Child identification                           2
  Residual miscellaneous responses              20
Item 39
  Male identification                          177
  Female identification                          6
  Animal identification                          4
  Child identification                           1
  Residual miscellaneous responses              24
Item 43
  Father identification                        112
  Female identification                          2
  Identification with other male relative       20
  Identifiable male figure (real or fictional)  16
  Ideal or hero                                 11
  Occupational responses                         9
  Residual male figure                          27
  Responses emphasizing individual uniqueness    8
  Residual miscellaneous responses               7


In addition to evoking responses which lend
themselves to classification with some ease
(especially Items 23 and 39), these stems
produce idiosyncratic responses which are
clinically rich and serve well as bases for hy-
potheses concerning the individual’s personal-
ity organization with reference to unconscious
needs, ego ideal, fantasied self-image, and the
like. The wide variety of individual replies to
such sentence stems insures that the intuitive
and imaginative clinician will have ample op-
portunity to exercise his skills and also makes
it obvious that the classificatory tone of this
note cannot be suggestive of a substitute
method of analysis.

The problem of making valid interpreta-
tions of an individual’s responses to projective
techniques will never be resolved by a com-
pletely normative approach. Those who prize
the clinical mystery need not fear that such
techniques will be reduced to a clerical op-
eration. However, the results presented here
should serve to caution the interpreter to take
due cognizance of the variability in projective
content evoked by different stimulus materials
—in this case different sentence stems. Fur-
thermore, here are provided some beginning
normative data on three clinically useful
items.

Received March 25, 1957.


References


1. Eron, L. D. A normative study of the Thematic
Apperception Test. Psychol. Monogr., 1950,
64, No. 9 (Whole No. 315).

2. Hanfmann, Eugenia, & Getzels, J. W. Studies of
the Sentence Completion Test. J. proj. Tech.,
1953, 17, 280-294.

3. Mainord, Florence R. A note on the use of figure
drawings in the diagnosis of sexual inversion.
J. clin. Psychol., 1953, 9, 188-189.

Journal of Consulting Psychology
Vol. 22, No. 1, 1958 


Some Determinants of the Perception of Hostility 


Bernard I. Murstein* 
University of Texas, M. D. Anderson Hospital and Tumor Institute 


The relationship between the responses ob- 
tained through the use of projective tech- 
niques such as the Rorschach and the overt 
behavior of persons has been one of the per- 
sistent problems confronting the clinical psy- 
chologist. Gluck, in studies with the TAT and 
Rorschach (3, 4), found no correlation be- 
tween overt hostile behavior and overt and 
covert signs of hostility on the Rorschach and 
TAT. Cattell (1) has stated a theory hold- 
ing for a proportionality of overt and covert 
personality deviation. Murstein (5), defining 
projection as “the ascribing of one’s own mo- 
tivations, feelings and behavior to other per- 
sons” referred to five variables which he con- 
sidered to be functions of projection: (a) the 
objective characteristics of the person, (b)
the self-concept, (c) the amenability of the 
trait investigated to measurement, (d) the 
conditions under which projection is meas- 
ured, and (e) the instrument used to measure 
projection. 

Findings related to each of these variables 
have also been reported. All of them were of 
some significance in either of two conditions 
(Rorschach, ego threat) investigated. Gluck’s 
failure to find any relationship between the 
Rorschach and overt behavior may well have 
been a function of the failure to take cog- 
nizance of the above variables. 

Granted that these variables are important 
in the prediction of the operation of the 
mechanism of projection, the question arises 
as to whether there are any significant rela- 
tionships between the tendency to project 
hostility on the Rorschach and the tendency 
to perceive hostility as the result of an ego- 
threatening situation. The procedure for in- 
vestigating this question follows. 


1 Now at Louisiana State University. 


Procedure 


The method of pooled ranks was used to 
obtain group and self judgments of hostility
on 536 students from 23 fraternities and two 
dormitories at The University of Texas by 
means of cutting scores based on considera- 
tion of the standard error of measurement. 
The method has been discussed more fully 
elsewhere (5). Eighty men, divided equally 
among four experimental groups (hostile-in-
sightful, hostile-noninsightful, friendly-insight-
ful, and friendly-noninsightful) were selected.

Each S of every group was given a Ror- 
schach test without formal inquiry for deter- 
minants. Only the content, animation, and 
description of the perceptions were recorded,
in a manner similar to that followed by Elizur 
in his Rorschach Content Test (2). The in- 
structions given to each S were: 


I am going to show you a series of cards which are 
in reality inkblots. These inkblots have been chosen 
because they resemble many things. I want you to 
look at each of these cards and to tell me the first 
three things that you see on them. Do not hesitate 
to tell me anything that you see. Try to give a full 
description of what you see rather than stopping 
after you have given just the name. There are no 
right or wrong answers as this is not a test. There 
are no restrictions at all as far as what you see. It 
is entirely up to you. I will write down on this sheet 
of paper whatever you tell me that you see. Are 
there any questions? 


After all questions had been answered, the 
Rorschach cards were presented in the usual 
manner. At the conclusion of the administra- 
tion of the cards, the examiner, after carefully 
glancing over the S’s record, said, “Now I 
shall give you a brief interpretation of what 
you have seen.” 

Each experimental group of 20 Ss was di- 
vided and each half matched according to the
means of group rankings and self-rankings
and their respective variances. Within each 
group, one half received the interpretation 
“friendly,” while the other half received the 
interpretation “hostile.” Initially, in selecting 
the Ss, the experimental group to which each 
S belonged was of course known. The author 
detached the name from the group so that at 
the time of testing (five weeks later) all that 
was known of any S was whether he was to 
receive a “friendly” or a “hostile” interpreta- 
tion. The purpose of this procedure was to 
forestall the possibility of bias on the part of 
the author which might favor one group over 
another. 

The following points were stated in the 
“friendly” report: 


1. You cooperated very nicely in taking this test. 
You saw things readily because you were interested 
and really “put yourself” into it. 2. Your perceptions 
are very rich in creativity and imagination. 3. Your 
perceptions reveal a lot of feeling for people, and a 
lot of warmth; you are a friendly, cooperative per- 
son. 4. Your perceptions indicate a deep sensitivity 
to the needs of others. 5. You are, therefore, psycho- 
logically speaking, a mature and fairly well-adjusted 
person. Do you have any questions? 


The following points were stressed in the 
“hostile” report: 


1. You showed a lack of cooperation in taking this 
test. Your perceptions indicate that you were bored, 
disinterested, and didn’t bother with the test. 2. Your 
perceptions are, accordingly, poor in imagination and 
indicate a lack of creativity. 3. You are pretty “cold” 
toward people and an uncooperative, hostile person.
4. Your perceptions indicate a lack of sensitivity to 
the needs of others. 5. You are, therefore, psycho- 
logically speaking, immature and not too well-ad- 
justed. Do you have any questions? 

The most frequent question was, “Where did you 
get that from?” In this case, a response was chosen 
at random and the examiner merely stated again that 
this had either a “friendly” or “hostile” connotation 
(accordingly), at a deeper psychological level. 


The examiner then presented the Examiner 
Rating Sheet (henceforth referred to as ERS) 
to the S, saying, 


Now, I shall ask you to fill out anonymously a 
rating sheet whereby you evaluate me. This is a part 
of some research that the psychology department is 
currently running. Do not put your name on this 
sheet. When you have finished the ratings, put the 
sheets in this envelope and seal it and drop it in 
this box. I shall return in about five minutes to see
whether or not you are finished and ready to go on
to the last part of the study. 


The ERS consisted of 19 statements taken 
from the larger Interview Rating Scale used 
by The University of Texas Testing and Guid- 
ance Bureau (7). A typical statement taken 
from this sheet was “The interviewer is a 
warm, sincere individual.” The S rated each
statement on a five-point scale: five for 
“strongly agree,” four for “agree somewhat,” 
three for “undecided,” two for “doubtful,” 
and one for “strongly disagree.” The higher 
the score, the more favorably the S rated the 
interviewer. 
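The scoring of the ERS described above can be sketched briefly. The text gives the number of statements (19), the five-point scale, and the direction of scoring; that the total is a simple sum of the 19 ratings is an assumption made here for illustration, as are the function and constant names.

```python
# Sketch of ERS scoring. Assumption: the total score is the plain sum of the
# 19 item ratings; the text states only that higher scores are more favorable.
ERS_ITEMS = 19
SCALE_LABELS = {
    5: "strongly agree",
    4: "agree somewhat",
    3: "undecided",
    2: "doubtful",
    1: "strongly disagree",
}


def ers_total(ratings):
    """Return the summed ERS score for one subject's 19 ratings (1 to 5 each)."""
    if len(ratings) != ERS_ITEMS:
        raise ValueError(f"expected {ERS_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        if rating not in SCALE_LABELS:
            raise ValueError(f"rating {rating!r} is not on the five-point scale")
    return sum(ratings)


# Possible range of totals under the summation assumption.
ERS_MIN = ers_total([1] * ERS_ITEMS)
ERS_MAX = ers_total([5] * ERS_ITEMS)
print(ERS_MIN, ERS_MAX)
```

Under this assumption a subject who marked "undecided" throughout would receive the midpoint total, and totals above it indicate a relatively favorable rating of the examiner.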

The ERS was filled out and returned, and 
then the examiner explained that the report 
which he had given the S was used for experi- 
mental purposes and was not meaningful. The 
random nature of selection of the “hostile” re- 
port was explained, and the fact that one per- 
son out of two in the total group received a 
“hostile” report seemed to reassure most of 
the Ss. None of the Ss seemed disturbed by 
the experiment. Before leaving, the Ss were 
asked not to mention any aspect of the ex- 
periment to anyone for four weeks, and were 
also thanked for their participation. To meas- 
ure the projection of hostility on the Ror- 
schach, a new scale was devised called the 
Rorschach Hostility Scale (RHS). This scale 
has a range of from one to seven points for 
any given response on the Rorschach. It is 
essentially a two dimensional scale, one di- 
mension being an overt-covert one, while the 
other is a lower-phyla—higher-phyla range. In 
other words, a mangled human would be 
scored higher with regard to hostility than a 
mangled insect. This scale has been shown 
to be quite reliable when scored by trained 
scorers, with an average reliability of .96 (4). 
The RHS was used to measure the hostile 
content of the Rorschach protocols of all 
80 Ss. 


Results 


In order to test the relationship between the 
RHS scores and the ERS scores, it was first 
necessary to transform the raw scores into 
standard scores. By comparing the two sets of 
scores in relation to the variables group judg- 
ments of person (“hostile” or “friendly”), 
self-concept (“hostile” or “friendly”), and
kind of interpretation received (“hostile” or
“friendly”), 27 possible combinations were 
arrived at, which are indicated in Table 1. 
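The count of 27 groupings follows from letting each of the three variables be "hostile," "friendly," or left unspecified. The sketch below enumerates them and derives the number of Ss in each; it assumes, from the Procedure, that the 80 Ss fall evenly into the eight cells of the design (10 per cell). The label wording and names are illustrative only.

```python
from itertools import product

VARIABLES = ("group evaluation", "self-concept", "interpretation")
LEVELS = ("hostile", "friendly", None)  # None: variable not used in the grouping
CELL_SIZE = 10  # assumption: 80 Ss spread evenly over the 2 x 2 x 2 cells


def enumerate_groupings():
    """List every grouping as (label, number of subjects)."""
    groupings = []
    for levels in product(LEVELS, repeat=len(VARIABLES)):
        fixed = [(var, lev) for var, lev in zip(VARIABLES, levels) if lev is not None]
        unspecified = len(VARIABLES) - len(fixed)
        # Each unspecified variable pools two cells, doubling the group size.
        n_subjects = CELL_SIZE * 2 ** unspecified
        label = " + ".join(f'{var} "{lev}"' for var, lev in fixed) or "all persons"
        groupings.append((label, n_subjects))
    return groupings


GROUPINGS = enumerate_groupings()
for label, n_subjects in GROUPINGS:
    print(f"{n_subjects:>2}  {label}")
```

The enumeration gives one grouping of 80 (all persons), six of 40, twelve of 20, and eight of 10, for 27 in all.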

Since the RHS distribution was quite 
skewed as compared to a normal distribution, 
when tested by the chi-square distribution 
(p < .01), a Pearson correlation between the 
two sets of scores (ERS and RHS) was not 
computed. Instead, the scores were converted 
into ranks and correlated by means of Ken- 
dall’s Tau (6). None of the correlations were 
significant at the .05 level. This is not unex- 
pected since the use of ranks discards avail- 
able information. A test for differences in lo-
cation, however, using Wilcoxon’s t for paired
replicates (8), resulted in nine significant dif- 
ferences at the .05 level, which for 27 com- 
putations was significant beyond the .001 
level (9). These significant t’s are indicated 
in Table 2. 
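The statement that nine differences significant at the .05 level out of 27 computations is itself significant beyond the .001 level can be checked with a binomial tail probability. The sketch below is a check of that arithmetic, not necessarily the tabled procedure of reference (9); it treats the 27 tests as independent, which is an assumption, since the groupings share subjects.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of k or more successes in n independent trials of chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_TESTS = 27        # number of groupings tested
N_SIGNIFICANT = 9   # differences reaching the .05 level
ALPHA = 0.05

# Chance probability of obtaining at least nine "significant" results in 27 tests.
P_OVERALL = binomial_tail(N_SIGNIFICANT, N_TESTS, ALPHA)
print(f"P(at least {N_SIGNIFICANT} of {N_TESTS} significant by chance) = {P_OVERALL:.2e}")
```

Under the independence assumption the chance probability is far below .001, which agrees with the level reported in the text.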

The main question arising from the data 
was, which of the three groups of variables, 
“kind of interpretation” received, “group 
evaluation” of the S, and the “self-concept,” 
showed the most significant change in the per- 
ception of hostility from the Rorschach to the 
examiner rating situation. It is apparent from 
examination of Table 2 that the “kind of in- 
terpretation” given is involved in the greatest 
number of significant differences, seven. The
group evaluation shows five significant dif-
ferences, while the self-concept shows four. 
The four significant groupings receiving a 
“friendly” interpretation show that these Ss 
perceived significantly less hostility in the ex- 
aminer than they did on the Rorschach. In 
other words, as compared to the amount of 
hostility they projected on the Rorschach, Ss 
who were told that they were “friendly” per- 
sons, perceived the examiner as being signifi- 
cantly more friendly and competent. The same 
is true of the three significant differences in- 
volving the “friendly” persons. Here, persons 
who were friendly, perceived little hostility in 
the examiner in comparison to the hostility 
they perceived on the Rorschach. This fact 
occurred despite the variation in the kind of 
interpretation received and in the perception 
of the “self.” On the other hand, a greater 
perception of hostility occurred on the ERS 
than on the Rorschach for the three “hostile” inter-
pretation groupings and one “hostile” per-
sons group. Perhaps unexpected is the fact
that hostile-insightful persons (group evalua-
tion “hostile,” self-concept “hostile”) per-
ceived more hostility on the Rorschach than
as a result of ego-threat (examiner situation).
The self-concept “friendly” variable showed 
significant differences when interacting with 
the “friendly” interpretation variable, with
more hostility being perceived as occurring
on the Rorschach than in the examiner.
Lastly, those persons whose self concept was
“friendly,” but who received a “hostile” in-
terpretation, perceived more hostility in the
examiner than they did on the Rorschach.


Table 1

All Possible Combinations (27) of the Variables, Group Interpretation “Hostile,” Group Interpretation
“Friendly,” Group Evaluation “Hostile,” Group Evaluation “Friendly,”
Self-Concept “Hostile,” and Self-Concept “Friendly”

Number of
Subjects   Variables

80         All persons
40         “Hostile” interpretation
40         “Friendly” interpretation
40         Group evaluation “hostile”
40         Group evaluation “friendly”
40         Self-concept “hostile”
40         Self-concept “friendly”
20         Group evaluation “hostile” + “friendly” interpretation
20         Group evaluation “hostile” + “hostile” interpretation
20         Group evaluation “friendly” + “friendly” interpretation
20         Group evaluation “friendly” + “hostile” interpretation
20         Self-concept “hostile” + “friendly” interpretation
20         Self-concept “hostile” + “hostile” interpretation
20         Self-concept “friendly” + “friendly” interpretation
20         Self-concept “friendly” + “hostile” interpretation
20         Group evaluation “hostile” + self-concept “hostile”
20         Group evaluation “hostile” + self-concept “friendly”
20         Group evaluation “friendly” + self-concept “hostile”
20         Group evaluation “friendly” + self-concept “friendly”
10         Group evaluation “hostile” + self-concept “hostile” + “friendly” interpretation
10         Group evaluation “hostile” + self-concept “hostile” + “hostile” interpretation
10         Group evaluation “hostile” + self-concept “friendly” + “friendly” interpretation
10         Group evaluation “hostile” + self-concept “friendly” + “hostile” interpretation
10         Group evaluation “friendly” + self-concept “hostile” + “friendly” interpretation
10         Group evaluation “friendly” + self-concept “hostile” + “hostile” interpretation
10         Group evaluation “friendly” + self-concept “friendly” + “friendly” interpretation
10         Group evaluation “friendly” + self-concept “friendly” + “hostile” interpretation


Table 2

Groupings Showing Significant Differences Between RHS and ERS Scores as Determined by
Wilcoxon Paired Replicate t Tests

ERS = perception of hostility higher in ERS than RHS
RHS = perception of hostility higher for RHS than ERS

Higher
perception   Groups

ERS          “Hostile” interpretation
RHS          “Friendly” interpretation
RHS          Group evaluation “friendly”
ERS          Group evaluation “hostile” + “hostile” interpretation
RHS          Group evaluation “friendly” + “friendly” interpretation
RHS          Self-concept “friendly” + “friendly” interpretation
ERS          Self-concept “friendly” + “hostile” interpretation
RHS          Group evaluation “hostile” + self-concept “hostile”
RHS          Group evaluation “friendly” + self-concept “friendly” + “friendly” interpretation

* Significance beyond .05 level.
** Significance beyond .02 level.
*** Significance beyond .01 level.


Discussion 

The situational quality of projection would 
seem to be strongly emphasized by the results. 
Apparently, the objective personality and the 
self-concept are less important than the kind 
of situation in which the S finds himself. Be- 
ing told that they were hostile caused the ma- 
jority of Ss to project this hostility on to a 
suitable object (the examiner giving the in- 
terpretation) by perceiving him as the “hostile 
one.” A “friendly” report elicited, in the main,
little perception of the interpreter as hostile.
The situation is not, however, the sole 
determinant of the operation of projection. 
Those Ss who were “friendly” tended to per- 
ceive less hostility in the examiner as com- 
pared to the Rorschach no matter what kind 
of interpretation they received nor what their 
self-concept. The same was not true of the
“hostile” Ss, who did not show any significant
difference when considered as an entity. Ap-
parently, the perceptions of “hostile” Ss were 
more readily influenced by changes in the en- 
vironment studied than was true of “friendly” 
Ss. Nevertheless, hostile insightful Ss (group 
evaluation “hostile,” self-concept “hostile”)
projected considerably more hostility on the 
Rorschach than in the examiner situation. 
That this fact is not due to the mere posses- 
sion of “hostility” may be gleaned from the 
fact that the group evaluation “hostile,” self- 
concept “friendly” group did not show any 
significant difference in their perception of 
hostility in the two situations. It seems safer 
to conclude, therefore, that the projection of 
hostility on the Rorschach in this study was 
a function of the possession of hostility in 
conjunction with insight into this fact. More- 
over, the fact that “friendly” Ss (without con- 
sidering their self-concepts) tended to mani- 
fest more hostile perceptions on the Rorschach 
than they did with regard to the interpreter 
seems at variance with some of the current 
thought about the meaning of responses 
elicited by projective techniques, namely, 
that elicited responses are representative of 
a person’s phenomenological world. Appar-
ently, clinical psychologists might well take
note of the fact that elicited hostile responses 
on the Rorschach may at times reflect only 
the person’s freedom from fear in expressing 
that which is indicative of his “life style.” 
Conversely, the absence of hostile responses 
in a protocol may for some persons be in- 
dicative not of nonhostile feelings, but rather 
of a reluctance or fear of expressing strong 
hostile feelings through perception on the Ror-
schach. Indeed, from a psychotherapeutic
point of view, those persons who are hostile
but deny this fact (hostile-noninsightful)
might be considered to be more deeply dis- 
turbed than those who are hostile, possess 
insight into this fact, and are willing to ac- 
knowledge their possession of hostility openly. 
Seemingly, therefore, a revision of the rationale
underlying the Rorschach is needed. 

The concept of one’s “self” as friendly is 
of importance in interaction with other vari- 
ables. It is the only variable that showed a 
shift in the perception of the examiner from 
“friendly” to “hostile” when the kind of inter- 
pretation given was changed from “friendly” 
to “hostile.” This serves to confirm the stand 
that these persons were operating from a 
phenomenological frame of reference, in that 
when the interpretation given was in accord- 
ance with the self-concept (“friendly” inter- 
pretation, “friendly” self concept) there was 
little perception of hostility on the part of the 
examiner. When the interpretation differed 
from the self concept (“hostile” interpreta- 
tion, “friendly” self concept), the perceived 
threat to the “self” resulted in the projection 
of the S’s hostility on to the interpreter. 


Summary 

The results seem to indicate that Ss pro- 
jecting hostility in one situation (Rorschach) 
show significant differences from the tendency 
to perceive hostility in another situation (re- 
action through rating of the examiner’s inter- 
pretation of their protocol). These differences 
have been shown to be related to such vari- 
ables as kind of interpretation of their proto- 
col given to the Ss by the interpreter, group 
evaluation of the S by his fraternity brothers, 
and the S’s self concept. Although each of 
these variables was related to the perception
of hostility, the strongest determinant ap-
peared to be the emotional climate created by 
the interpretation given to the Ss regarding 
their Rorschach performance. Those Ss called 
“hostile” reacted to this threat to the “self” 
by rating the interpreter as “hostile.” Those
Ss called “friendly” rated the interpreter as 
competent and “friendly.” 

An additional finding was that “friendly” 
Ss tended to perceive more hostility on the 
Rorschach than they did in the examiner rat- 
ing situation. This finding was not true of 
“hostile” Ss whose perceptions were influenced 
by whether they received a “friendly” or 
“hostile” interpretation of their Rorschach 
protocols. Lastly, the finding that “friendly” 
Ss, and “hostile-insightful” Ss showed a sig- 
nificantly greater production of hostile re- 
sponses on the Rorschach than in the ex- 
aminer rating situation seems to warrant 
further investigation as to the meaning of 
responses obtained on the Rorschach. 


Received April 4, 1957. 


References 


1. Cattell, R. B. Principles of design in “projective” 
or misperception tests of personality. In H. H. 
Anderson & Gladys L. Anderson (Eds.), An
introduction to projective techniques. New 
York: Prentice-Hall, 1951. Pp. 55-98. 

2. Elizur, A. Content analysis of the Rorschach with 
regard to anxiety and hostility. J. proj. Tech., 
1949, 13, 247-284. 

3. Gluck, M. R. The relationship between hostility 
in the TAT and behavioral hostility. J. proj. 
Tech., 1955, 19, 21-26. 

4. Gluck, M. R. Rorschach content and hostile be- 
havior. J. consult. Psychol., 1955, 19, 475-478. 

5. Murstein, B. I. The projection of hostility on the 
Rorschach, and as a result of ego-threat. J. 
proj. Tech., 1956, 20, 418-428. 

6. Smith, K. Distribution-free statistical methods and 
the concepts of power efficiency. In L. Fes- 
tinger & D. Katz (Eds.), Research methods in 
the behavioral sciences. New York: Dryden 
Press, 1953. Pp. 536-577. 

7. University of Texas Interview Rating Scale (Form 
A) Testing and Guidance Bureau (mimeo- 
graphed). Austin: Univer. of Texas, 1955. 

8. Wilcoxon, F. Some rapid approximate statistical 
procedures. New York: American Cyanamid, 
1949. 


9. Wilkinson, B. A statistical consideration in psy- 
chological research. Psychol. Bull., 1951, 48, 
156-158. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


Social Desirability as a Variable in the Edwards 
Personal Preference Schedule 


Norman L. Corah, Marvin J. Feldman, Ira S. Cohen, Walter Gruen, 
Arnold Meadow, and Egan A. Ringwall 
University of Buffalo 


Edwards (1, 2) has suggested that the fac- 
tor of social desirability contributes to the 
unreliability of paper-and-pencil personality 
tests. He has attempted to eliminate the in- 
fluence of this factor from his Personal Pref- 
erence Schedule (EPPS) by using a forced 
choice technique in which the item pairs are 
presumably equated for social desirability. 

The present investigation arose in conjunc- 
tion with another research project, involving 
part of the EPPS, in which the writers are 
engaged. In this project, all item pairs used 
by Edwards (3) to compare the variables of 
Achievement, Order, Succorance, Abasement, 
Heterosexuality, and Aggression were selected. 
Since each variable is paired twice with every 
other variable, a scale of 30 item pairs was 
obtained. The item pairs were arranged in 
random order on this short form. Preliminary 
use of the scale began to raise doubts as to 
whether many of the item pairs were equated 
for social desirability. Although the method 
of successive intervals presumably yields scale 
values for items comparable to those obtained 
by the method of paired comparisons (1, 4), 
Edwards made no attempt to check his items 
for judged social desirability after they had 
been paired. It seems possible that responses 
to pairs differ from responses to single items. 
A judge rating a pair of items for social de- 
sirability might make a clear-cut choice even 
though, in rating the items individually, he 
might assign both the same value. Hence, the 
present study investigates the efficacy of part 
of the EPPS (one-seventh of the total num- 
ber of item pairs) in eliminating the factor of 
social desirability when item pairs are con- 
sidered jointly. 


Method and Results 


All of the Ss used were members of intro- 
ductory psychology classes at the University 
of Buffalo. The 30 item pairs were first given 
to a group of 50 men and 31 women with the 
following instructions: “Select the statement 
in each pair which you think is more socially 
desirable—that is, the statement that would 
make another person look better to other peo- 
ple if it were said of him.” The data were ana- 
lyzed in several different ways to test for the 
influence of social desirability. 

First, if social desirability does not ma- 
terially influence choices, then a group of Ss 
who are forced to choose the member of the 
pair which they believe to be more socially 
desirable should choose each member with 
equal frequency (p = .50). The binomial ex- 
pansion was used as a basis for testing the 
above hypothesis. 
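The test just described can be sketched in a few lines. This is only an illustration under stated assumptions: the article does not say whether the test was one- or two-sided (a two-sided exact test is assumed here), and the 53-to-28 split is a hypothetical count, not a figure from the study.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k of n judges choosing item A when p = .50."""
    # Under p = .50 the distribution is symmetric, so the two-sided
    # p-value is twice the smaller tail (capped at 1).
    tail = min(k, n - k)
    smaller_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * smaller_tail)


N_JUDGES = 81  # 50 men + 31 women, as reported in the text

# Hypothetical split for one item pair: 53 judges choose A, 28 choose B.
p_value = binomial_two_sided_p(53, N_JUDGES)
print(f"two-sided p = {p_value:.4f}")
```

An item pair would be counted as departing from equal desirability when this value falls below the chosen level (.05 or .01 in the text).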

This hypothesis was rejected at the .05 
level for 20 of the item pairs (rejected for 17 
at the .01 level). The chance probability of 
obtaining 20 statistics significant at the .05 
level from 30 calculated statistics (7) is far 
beyond the .001 level. Of the 20 significant 
item pairs, Achievement was judged more de- 
sirable in seven pairs (a maximum of 10 pos- 
sible for each variable), Order in four pairs, 
Succorance and Abasement each in three pairs, 
Heterosexuality in two pairs, and Aggression 
in only one pair. When pairing the follow- 
ing variables: Achievement and Succorance, 
Achievement and Heterosexuality, Order and 
Aggression, Abasement and Heterosexuality, 
the item representing the former variable was 
always judged by the group to be more so- 
cially desirable than the latter. 
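The claim that 20 significant results out of 30 lies far beyond the .001 level can be checked with a binomial tail. The sketch below assumes the 30 tests are independent, each with a .05 chance of a spurious rejection; it is a simplified stand-in and may not match the procedure of reference (7) in detail.

```python
from math import comb


def prob_at_least(k: int, n: int, alpha: float) -> float:
    """P(at least k of n independent tests are significant by chance)."""
    return sum(
        comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


# 20 of 30 item pairs significant at the .05 level, as reported above.
p_chance = prob_at_least(20, 30, 0.05)
print(f"P(20 or more of 30 by chance) = {p_chance:.2e}")
```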




It seemed possible that the group might 
have a systematic position preference for 
either the A or B members of the pairs re- 
gardless of content. The binomial expansion 
was also used as a basis for testing this pos- 
sibility. No significant overall trend was found 
for selecting either A or B members of the 
pairs. 

It is also interesting to note that the Ss ap- 
parently had little difficulty in making the 
judgments. The 30 item pairs were judged for 
social desirability by another group of 44 Ss, 
38 men and six women, as above with addi- 
tional instructions to give the judgment of 
“equal” where they thought it was appropri- 
ate. It might be expected that judges would 
have frequent recourse to the “equal” cate- 
gory if the item pairs were equated for social 
desirability. The percentages of “equal” judg- 
ments for the item pairs ranged from 11 to 
55 per cent, with a median of 25 per cent, 
but only 11 of the item pairs received more 
than 25 per cent judgments of “equal.” This 
method of judging item pairs seems to give 
essentially the same results as judging with- 
out the “equal” category. Of the 11 item pairs 
receiving more than 25 per cent judgments of 
“equal” in this second judging procedure, the 
null hypothesis had been retained for seven 
in the first procedure. Of the 19 item pairs 
receiving 25 per cent or less judgments of 
“equal” in the second procedure, the null hy- 
pothesis had been rejected for 14 in the first 
procedure. 

The second approach to analyzing the in- 
fluence of social desirability on item responses 
follows the method used by Edwards (3) with 
the exception that pairs rather than individual 
items were utilized. Edwards used the differ- 
ence between the separate A and B item scale 
values as the index of social desirability for 
each AB pair. For each pair of items in the 
present investigation, the percentage of the 
first group of 81 judges judging A as more 
desirable in each AB pair was used as the 
index of social desirability. These percent- 
ages ranged from 10 to 85 per cent, with a 
median of 53 per cent. 

The 30 item pairs were administered to an- 
other group of 207 men and 54 women who 
answered the items with standard instruc- 
tions, i.e., as the items applied to themselves. 


Because of the possibility of sex differences 
either in responding to or in judging the 
items, it was decided to keep the sex ratio 
proportional to that of the group which had 
judged the items for social desirability. Con- 
sequently, 87 answer sheets from the group 
of 207 men were randomly drawn and com- 
bined with those of the 54 women, giving a 
total N of 141 for the group which answered 
the scale. The percentage answering A in each 
AB pair (ranging from 9 to 78 per cent, 
median of 49.5 per cent) was plotted against 
the item pair indices of social desirability de- 
scribed above. The product-moment correla- 
tion between the two variables was .88. The 
coefficient of determination indicates that .77 
of the total variance can be accounted for by 
the variation of the index of social desir- 
ability. 
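The arithmetic in this paragraph can be retraced as follows. The squared correlation is taken directly from the reported r. The proportional-sampling step is an assumption: the article does not give the formula that produced 87, but matching the judges' sex ratio reproduces it.

```python
# Coefficient of determination from the reported correlation.
r = 0.88
r_squared = r ** 2  # proportion of variance accounted for

# Sex ratio of the judging group (from the text): 50 men, 31 women.
judges_men, judges_women = 50, 31
respondents_women = 54  # all women respondents were kept

# Assumed rule: draw enough men to match the judges' men-to-women ratio.
men_to_draw = round(respondents_women * judges_men / judges_women)
total_n = men_to_draw + respondents_women

print(f"r squared = {r_squared:.2f}")
print(f"men drawn = {men_to_draw}, total N = {total_n}")
```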


Discussion 


There are at least three possible interpreta- 
tions of these results with regard to their im- 
plications for the EPPS as a whole. First, the 
30 item pairs used may constitute a biased 
sample of the total N of 210 item pairs. How- 
ever, there is no a priori reason to suppose 
that this is the case. The items used in this 
study were not chosen because it was felt that 
they were more susceptible to the influence of 
social desirability. 

There is also the possibility that the sample 
of Ss used was not representative of the popu- 
lation from which the social desirability scale 
values were obtained. While Edwards (3) ob- 
tained his normative data from a large num- 
ber of colleges throughout the country, he 
used only University of Washington students 
as judges of the social desirability of his 
items. However, this interpretation appears to 
be untenable in the light of the data reported 
by Klett (6) in which social desirability rat- 
ings of the single items obtained from widely 
differing groups (high school students, Nisei, 
Norwegians) correlated highly with the rat- 
ings obtained by Edwards. 

The third alternative is that many item 
pairs on the EPPS may not be equated for so- 
cial desirability and that Edwards wholly or 
partially failed to eliminate this source of un- 
reliability from his scale. It should be noted 
that the correlation of .88 between percentage 




of choice and social desirability obtained in 
this study is comparable to the correlation of 
.87 reported by Edwards (2) when his items 
were administered in the “Yes-No” type of 
inventory. While the number of item pairs (30) upon which 
the present correlation is based is relatively 
small, the correlation appears to be more 
likely the result of the method used to judge 
social desirability rather than the result of a 
biased or restricted sample. It would seem 
that, from responses to single items, one can- 
not generalize with impunity about the in- 
fluence of social desirability on forced-choice 
pairings. Our findings suggest that paired 
items can acquire contextual meaning which 
alters the values assigned to the same items 
when responded to separately. They are con- 
sistent with other investigations, e.g., Howes 
and Osgood (5), which demonstrate the influ- 
ence of context in changing the meaning of 
individual words. It appears likely that Ed- 
wards, in using the method of successive in- 
tervals to scale his items, achieved only a first 
approximation of equal pairings and that 
additional judgments of the item pairs themselves 
are necessary along with revisions in 
pairing before the variable of social desir- 
ability can be eliminated from the EPPS. 


Summary 


The present study investigated the extent 
to which social desirability influences 
responses to a short form of the Edwards 
Personal Preference Schedule. The results 
indicate that the factor of social desirability is 
still an important influence since, in a sta- 
tistically significant number of item pairs, 
judges had a definite preference for one mem- 
ber of the pair as being more desirable. Also, 
the preferences of judges in terms of social 
desirability correlated highly with actual 
choices of another group of Ss who took the 
scale as a standard personality test. 


Received October 21, 1957. 
Early Publication. 


References 


1. Edwards, A. L. The scaling of stimuli by the 
method of successive intervals. J. appl. Psy- 
chol., 1952, 36, 118-122. 

2. Edwards, A. L. The relationship between judged 
desirability of a trait and the probability that 
the trait will be endorsed. J. appl. Psychol., 
1953, 37, 90-93. 

3. Edwards, A. L. Edwards Personal Preference 
Schedule. Manual. New York: Psychological 
Corp., 1954. 

4. Edwards, A. L., & Thurstone, L. L. An internal 
consistency check for scale values determined 
by the method of successive intervals. 
Psychometrika, 1952, 17, 169-180. 

5. Howes, D. H., & Osgood, C. E. On the combination 
of associative probabilities in linguistic 
contexts. Amer. J. Psychol., 1954, 67, 241-258. 

6. Klett, C. J. The stability of the social desirability 
scale values in the Edwards Personal Prefer- 
ence Schedule. J. consult. Psychol., 1957, 21, 
183-185. 

7. Sakoda, J. M., Cohen, B. H., & Beall, G. Test of 
significance for a series of statistical tests. 
Psychol. Bull., 1954, 51, 172-175. 


Journal of Consulting Psychology 
Vol. 22, No. 1, 1958 


A Factorial Isolation of Psychiatric 
Outpatient Syndromes¹ 


Mary Helen Tatom 
Legal Psychiatric Services, D. C. Department of Public Health 


This study represents an attempt to vali- 
date certain psychiatric diagnostic entities, 
using the technique of obverse factor analy- 
sis to establish modal patterns of symptom 
groupings. If clinical classifications of pa- 
tients are valid, they should divide patho- 
logical traits into syndromes. It was the prob- 
lem of this paper to test the existence of 
four distinct disease reactions corresponding 
to psychiatric diagnoses in an outpatient 
population, and to establish by means of the 
technique of obverse factor analysis patterns 
of characteristics common to the members of 
each group and not possessed by members of 
other groups. 


Procedure 
Subjects 


The Ss were 20 male veterans who were 
receiving psychotherapeutic treatment in the 
Veterans Administration Mental Hygiene 
Clinic of the Washington, D. C., regional of- 
fice. Their ages ranged from 25 to 44 years, 
with a median age of 32 years. Five Ss were 
selected from each of the following four diag- 
nostic categories: Hysteric, Anxiety State, 
Schizophrenic, and Obsessive-Compulsive. At 
least two medical diagnoses were used in the 
selection of the psychoneurotic Ss. The thera- 
pist’s diagnosis was confirmed by a concur- 
ring diagnosis from another source. 

A somewhat different procedure was fol- 
lowed in the selection of schizophrenic Ss. 
Since these Ss had not been hospitalized, and 
prodromal symptoms had in some instances 


1 This article represents a treatment of one of two 
problems included in a doctoral dissertation, Catholic 
University of America. It was carried out using the 
facilities of the Veterans Administration Mental Hy- 
giene Clinic, Washington, D. C. 


73 


been previously described as an “anxiety 
state,” no confirming medical diagnosis was 
secured on these patients. They were selected 
on the basis of the current diagnosis of their 
respective treating physicians plus case his- 
tory. 


Measures 


The measures consisted of 67 rating scales 
which were essentially the same as those analyzed 
by Lorr and Rubinstein (9) in an R-type 
factor analysis using a population of 
psychiatric outpatients. The following are 
examples of the kind of traits which the scales 
were designed to measure: Emotional Reactivity, 
Concern with Conformity, Hostile Passive 
Obstructionism, Compulsions, and Gas- 


Table 1 
Oblique Factor Matrix V = F A 

Factor 
07 29 3 —.02 
—.26 —.02 —.49 
02 01 69 —.03 
13 00 —.04 —.49 
—.23 44 02 
—.20 49 Al —01 
—.08 44 58 
—.29 33 —.10 
00 57 04 
—.25 19 40 29 
11, S-1 20 —.17 —.03 
12. S2 38 13 38 12 
13. S3 32 27 07 
14. S4 05 .29 —.26 —.03 
15. S5 16 19 —.22 07 
16. O-1 .39 16 —.44 
17. 0-2 37 —.07 —.04 —.04 
18. 0.3 60 07 —.07 —.03 
19. O4 58 —01 —.36 —A7 
20. O-5 —.03 00 




Table 2 
Trait Composition of Primary Factors 


Primary Factors 


Scale Description 


> 


B D 


Underreacts emotionally—overreacts 
Depressed—elated 

Even in mood—mood swings 
Inhibited—uninhibited 

Tense—phlegmatic 

Concerned with self—unconcerned 

Lacking self-assurance—self-assured 
Selfconscious—unselfconscious 
Submissive—dominant 

Rigid—flexible 

Outgoing—seclusive 

Poorly adjusted socially—gains acceptance 

Rarely sees a task through—perseveres 
Effeminate—masculine 

Unsatisfied need for affection—no unsatisfied need 
Egocentered—nonegocentered 

Seeks support from others—does not seek support 
Lacks initiative—acts independently 

Avoids responsibility—accepts responsibility 

Feels rejected—feels accepted 

Perceives world as hostile—perceives world as friendly 
High level of achievement—low level of achievement 
Considers the future—does not consider the future 
Strong superego—weak superego 

Impulse control—lack of impulse control 
Orderly—disorderly 

Perfectionistic—unconcerned with tasks 
Concerned with conformity—nonconformist 
Hostile to authority—accepts authority 

Blames self—does not blame self 

Projects blame—does not project blame 

Denies the existence of difficulties—exaggerates difficulties 
Has insight—lacks insight 

Overt hostility—no overt hostility 

Hostile withdrawal—no hostile withdrawal 

Hostile passive obstructionism—no hostile passive obstructionism 
Interested in activities around him—lacks interest 
Frustration tolerance—lacks frustration tolerance 
Needs approval—does not need approval 
Suspicious—trustful 

Concerned over hostility—unconcerned 
Overconcerned with symptoms—unconcerned with symptoms 
Autism—no autism 

Anxious—unworried 

Hostile—not hostile 

Irritable—not irritable 

Unrealistic ambition—no unrealistic ambition 

Sex conflict—no sex conflict 

Uses complaints—no use of complaints 

Unreality feelings—no unreality feelings 
Phobias—no phobias 

Suicidal thoughts—no suicidal thoughts 

Reality distortion—no reality distortion 

Ideas of reference—no ideas of reference 


+ 


+ + 


i++ +14 


++1+ + + 


Scale 
No. 
+ 
a 
+ 
10 
12 ~ 
13 + _ 
14 + ~ 
15 + 
16 + - 
17 4 
18 
19 + = 
20 + 
21 + 
22 
23 - + 
24 
25 + 
26 + - 
27 _ + 
28 
29 + 
30 - - + 
31 + + - 
32 - + 




Table 2—Continued 


Scale Description 


Primary Factors 


Obsessive thoughts—no obsessive thoughts 


Obsessive hostile impulses—no obsessive hostile impulses 


Compulsions—no compulsions 
Alcoholic addiction—no alcoholic addiction 


Marked behavioral reaction to alcohol—no behavioral reaction to alcohol 
Conversion symptoms—no conversion symptoms 


Anergic complaints—no anergic complaints 


Body preoccupation—no body preoccupation 


Headaches—no headaches 
Sleep difficulties—no sleep difficulties 


Intestinal symptoms—no intestinal symptoms 


Gastric symptoms—no gastric symptoms 


+ 


Cardiovascular symptoms—no cardiovascular symptoms 


tric Symptoms. The form of the rating scales 
is illustrated below: 


Trait 28. Concern with Conformity: 


1. Shows excessive and chronic concern for con- 
formity with standards of the group. Would hesitate 
long before committing a violation. 

2. Shows a fairly strong concern over adherence 
to social conventions. Feels uncomfortable if he is 
not conforming. 

3. Shows a slightly more than average concern 
over adherence to social conventions. 

4. Shows somewhat less than average concern over 
adherence to social conventions. 

5. Shows little concern over disregard of social 
conventions. 

6. Shows no concern over disregard of social con- 
ventions. 

X. Unrateable. (Specify) 


Ratings were secured from each patient’s 
individual therapist. Since the patients had 
been in therapy for varying lengths of time 
(one month to two years), an attempt was 
made to control the effect of changes that 
might have taken place during the therapeutic 
process by instructing the therapist to refer 
to the notes of his earliest sessions in rating 
those patients who, he felt, had changed under 
treatment. The data analyzed included 
ratings by psychiatrists, clinical psychologists, 
and social workers. To insure comparability of 
ratings, a rating guide was distributed to each 
person responsible for evaluating a patient. 
Further clarification was obtained by devoting 
a two-hour staff conference in the participating 
clinic to a discussion of scale items. 


Table 3 


Oblique Reference Vectors in Relation to 
Unrotated Factors 


Rotation Matrix A 

Reference Vectors 

Unrotated 
Factors 


Table 4 


Relations of Oblique Reference Vectors to One Another 
A’‘A-Direction Cosines 


Vector A B D 


—.28 
—.05 61 
67 


Table 5 


Relations of Primary Factors to One Another 
D (A’‘A)* D 


I 
II 
iil 
IV 


Scale 
No. ‘ £8 
SS - + 
56 
58 - + + 
59 in 
60 + 
+ + 
- 
63 + + - ; 
64 + 
- - - 
- - + 
B 
D 0 
D Factor To Tp Te Ta 
48 —.21 — 36 02 
48 28 —-43 =—Al Tp 00 
76 19 57 —.18 T. — 40 —.32 
— 40 .92 60 89 Ta 60 —.38 — 48 


Table 6 
Unrotated Factor Matrix F,* 


Table 8 
Rotation Matrix A, 


Second-order Factor 


I, Il, 
A   .707   .236 
B  -.192   .698 
C  -.461  -.498 
D   .911  -.231 


* Subscript indicates second-order factors. 


Design 


The technique of obverse factor analysis 
was used to determine whether the four diag- 
nostic entities could be reproduced statisti- 
cally and also to determine any inherent 
relations among these so-called entities. The 
resulting factors might be considered as rep- 
resenting types of pathological adjustment, 
which could further be described in terms of 
the individual traits they comprise. 

Tetrachoric correlations were obtained 
among the 20 Ss, 5 from each diagnostic 
category. Traits were dichotomized at the in- 
terval nearest the median of the 20 patients. 
Using Thurstone’s (12) centroid method of 
factor analysis, loadings on four factors were 
extracted from the 20 X 20 correlation matrix. 
Factor extraction was stopped when the dis- 
tribution of residual r’s around zero approxi- 
mated a chance distribution. Rotation to sim- 
ple structure was carried out according to the 
method described by Thurstone (12, pp. 194— 
216). Simplicity of structure was given pri- 
ority over orthogonality. The correlations be- 
tween primary vectors were determined and 
the resulting matrix factored, yielding two 
second-order factors. These were rotated so 
as to retain an orthogonal structure. The factors 
arrived at statistically were examined as 
to the personality pattern they delineate and 
compared with present psychiatric concepts 
of disease entities. 


Table 7 
Rotated Factor Matrix V, = F,A, 

Primary     Second-order Factor 
Factor         X        Y 
A            .530     .001 
B           -.026     .514 
C           -.422    -.233 
D            .667    -.359 


Y 


I, .675 
II, -.223 


.223 
.675 
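As a rough illustration of the centroid step named in the Design section, the first centroid factor can be computed from the column sums of a correlation matrix. This is a minimal sketch, not the computation actually carried out in the study: the 3 x 3 matrix is hypothetical, unities are assumed in the diagonal (communality estimates are often used instead), and the sign reflection needed before later factors are extracted is omitted.

```python
from math import sqrt


def first_centroid_factor(R):
    """Loadings on the first centroid factor of a correlation matrix R."""
    n = len(R)
    column_sums = [sum(R[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)
    # Each loading is the column sum divided by the root of the grand sum.
    return [s / sqrt(total) for s in column_sums]


def residual_matrix(R, loadings):
    """Correlations left over after removing one factor."""
    n = len(R)
    return [[R[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


# Toy 3 x 3 matrix (hypothetical values, not the study's 20 x 20 matrix).
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
loadings = first_centroid_factor(R)
residuals = residual_matrix(R, loadings)
print([round(a, 3) for a in loadings])
```

The residual entries sum to zero by construction, which is why further factors are taken from the residual matrix only after some variables have been reflected in sign.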


Results 


The Rotated (oblique) Factor Matrix is 
given in Table 1.² The following procedure 
was used in identifying primary factors: three 
individuals were selected as representatives of 
each factor, taking into consideration both 
saturation with that factor and relative purity 
(freedom from loadings on other factors). 
Factor A was represented by individuals O-3, 
O-4, and O-5; Factor B by H-5, A-1, and 
A-4; Factor C by H-3, A-2, and A-5; and 
Factor D by H-2, H-4, and S-1. A trait was 
considered as contributing to a factor when 
all of the three representatives of the factor 
fell on the same side of the median in the 
trait as rated. The composition of the factors 
in terms of traits derived by this procedure 
is presented in Table 2. A pattern of 
distinguishing traits for each primary factor was 
derived from Table 2 by assigning to each 
primary factor those traits which differentiated 
it from all others, i.e., those traits which were 
not represented at all in any other factor, or 
which were not represented similarly (as to 
presence or absence) in another factor. 
Further discussion of the trait composition of 
primary and second-order factors will be in 
terms of these distinguishing (pathognomonic) 
traits. 

The cosine matrix (Table 5) of the primary 
factors was factored to obtain second-order 
parameters. The two second-order factors 
were plotted as a step toward their 
orthogonal rotation. Inspection of this plot 
shown in Fig. 1 indicates certain interesting 
relationships. The rotation indicated by dotted 
lines in Fig. 1 and as completed in Table 7, 
permitted a tentative identification of Factor 
X by examining those traits which differentiated 
between Factors A and C and were in no 
way involved in Factor B. Traits which 
differentiated between Factor C and Factor D 
were included if they were not involved in 
Factor B. Second-order Factor Y was 
tentatively identified by isolating those traits which 
were peculiar to B (for which there was no 
indication in other factors) or which 
differentiated between B and D and did not 
differentiate between A and C. Of this group of 
traits, those which showed some relationship 
between D and either A or C were omitted, 
since D had some loading on X. Traits of 
second-order Factors X and Y are given in 
Tables 9 and 10, respectively. 


Fig. 1. Rotation of second-order factors. 


2 Correlation and centroid factor matrices may be 
obtained without charge from Mary H. Tatom, 3801 
Connecticut Ave., Washington 8, D. C., or from the 
American Documentation Institute. Order Document 
No. 5434, remitting $1.25 for microfilm or $1.25 for 
photocopies. 


Table 9 
Factor X 

Scale   Scale Description 
No.     Positive                                  Negative 
 2      Depressed                                 vs. Elated 
 3      Even mood                                 vs. Mood swings 
 7      Conspicuously lacking in self-assurance   vs. Self-assured 
 9      Submissive                                vs. Dominant 
10      Rigid                                     vs. Flexible 
11      Seclusive                                 vs. Outgoing 
12      Poorly adjusted socially                  vs. Gains acceptance socially 
37      Little interest in activities around him  vs. Interested in activities around him 
49      Uses complaints                           vs. Does not use complaints 


Table 10 
Factor Y 

Scale   Scale Description 
No.     Positive                          Negative 
 4      Uninhibited                       Inhibited 
13      Rarely sees a task through        Perseveres 
14      Effeminate                        Masculine 
15      Unsatisfied need for affection    No unsatisfied needs to be loved 
24      Lack of superego                  Severe superego 
26      Disorderly                        Orderly 
27      Unconcerned with tasks            Perfectionistic 
28      Does not conform                  Concerned with conformity 
33      Lack of insight                   Insight 
42      Overconcerned with symptoms       Not concerned with symptoms 
51      No phobias                        Phobias 
52      Suicidal impulses                 No suicidal impulses 
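The rule stated in the Results section, that a trait contributes to a primary factor when all three representatives of the factor fall on the same side of the median rating, can be written out as a short sketch. The ratings below are hypothetical and the function name is illustrative. Treating a rating exactly at the median as belonging to neither side is an assumption, since the article only says that traits were dichotomized at the interval nearest the median.

```python
from statistics import median


def trait_side(ratings, representatives):
    """Return 'above' or 'below' if all representatives share a side, else None."""
    cut = median(ratings.values())
    values = [ratings[subject] for subject in representatives]
    if all(v > cut for v in values):
        return "above"
    if all(v < cut for v in values):
        return "below"
    return None  # representatives disagree, or one sits at the median


# Hypothetical ratings on one six-point scale for seven patients.
ratings = {"O-3": 1, "O-4": 2, "O-5": 1, "H-1": 4, "H-2": 5, "A-1": 3, "A-2": 6}

print(trait_side(ratings, ["O-3", "O-4", "O-5"]))  # all three below the median
print(trait_side(ratings, ["H-1", "H-2", "A-1"]))  # A-1 sits at the median
```

Applying the same check to every scale and every factor would yield a sign table of the kind summarized in Table 2.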




Discussion 

The factors isolated statistically divided pa- 
tients into groups which cut across clinical 
diagnostic classifications. The clinically diag- 
nosed Obsessive-Compulsives clearly defined 
Factor A. Factor B showed heaviest loadings 
in two patients who had been clinically diag- 
nosed as Anxiety State and one patient who 
had been diagnosed Hysteric. The three in- 
dividuals with highest loadings in Factor C 
had been diagnosed as Anxiety State (two pa- 
tients) and Hysteric (one patient). The nega- 
tive pole of Factor D was defined by two per- 
sons who had been diagnosed as Hysteric and 
one who had been diagnosed as Schizophrenic. 
In terms of traits, the syndromes derived by 
statistical analysis have been tentatively re- 
lated to clinical concepts as follows: 


Primary factors 
Factor A—Outpatient paranoid  schizo- 
phrenic 
Factor B—Conversion hysteric 
Factor C—Intrapunitive, socially mature 
personality vs. narcissistic per- 
sonality 
Factor D—Passive dependent personality 
vs. anti-social personality 
Second-order factors 
Factor X—Schizothymia vs. cyclothymia 
Factor Y—Uncontrolled emotionality vs. 
overcontrolled emotionality 


Factor A 


It will be observed that traits specific to A, 
not involved in Factor X, are traits implying 
reality distortion, flattened affect, and para- 
noid ideation: 


Trait No. 

 1  Underreacts emotionally. 
21  Perceives world as hostile. 
34  No overt hostility. 
40  Suspicious. 
47  Unrealistic ambition. 
53  Reality distortion. 
54  Ideas of reference. 
58  No alcoholic addiction. 
64  Sleep difficulties. 
65  Intestinal symptoms. 


This symptom pattern appears to be classi- 
cally schizophrenic. Clinically diagnosed ob- 
sessive-compulsive neurotics formed a homo- 


geneous group on the basis of the pattern of 
traits represented by Factor A, while clinically 
diagnosed schizophrenics showed relatively 
low loadings on all factors. The finding that 
clinically diagnosed schizophrenics had little 
common factor variance in this analysis sug- 
gests that certain essentially schizophrenic 
characteristics were not covered by the rating 
scales used. It is possible that paranoid schizo- 
phrenia developing in an obsessive-compulsive 
personality structure presents a consistent 
pattern, while other types of incipient schizo- 
phrenia show such varied overt symptoma- 
tology that early schizophrenia as such can- 
not be isolated as a clear-cut entity. Clinicians 
are familiar with the wide variety of behavior 
on which the diagnosis of incipient schizo- 
phrenia may be based: panic states, bizarre 
acting out behavior, a definite change in per- 
sonality, and general intellectual or behavioral 
disorganization. The patients who define Fac- 
tor A resemble ambulatory schizophrenics 
whose adjustment is stabilized at the level of 
minimal social functioning, who retain some 
controls, and who are not at present threat- 
ened with an acute psychotic episode. Such 
patients might receive the diagnosis of obses- 
sive-compulsive neurotic on the basis of char- 
acterological features such as rigidity, inhibi- 
tion, and absence of overtly hostile behavior, 
though neurotic symptoms were not present. 

Factor A is somewhat similar to Lorr and 
Rubinstein’s (9) Factor B and (10) Factor 
G, and to the negative poles of Cattell’s (4) 
Factors A, H, and L. 


Factor B 


Factor B is composed of a group of charac- 
teristics descriptive of unrestrained emotional 
behavior. The syndrome isolated here corre- 
sponds to some extent to the General Emo- 
tionality factors of Burt (2), Eysenck (6), 
Rao and Slater (11), and Cattell (4). In the 
present study, poor emotional control, an un- 
satisfied need for love and care, and lack of 
insight are associated with conversion symp- 
toms. In terms of psychiatric formulations, 
the group of traits which make up Factor B 
are those usually associated with conversion 
hysteria. The patients who defined this factor 
had been diagnosed Anxiety State (two pa- 
tients) and Hysteric (one patient). 




Factor C 


Factor C is a bipolar factor, described at 
its positive pole by a group of patients who 
are socially adjusted, independent, and non- 
egocentered. On the pathological side, they 
tend to assume blame and present gastric 
symptoms. Perhaps in a clinic population, in- 
dependence and cooperativeness are achieved 
at the expense of inner strain. The negative 
pole of Factor C is characterized by a group 
of traits which involve a hostile, passive, and 
egocentered adjustment which corresponds to 
the clinical concept of narcissistic character 
disorder. The diagnosis Inadequate Person- 
ality is sometimes applied to this type of pa- 
tient. Though the negative pole of the factor 
is not defined by a group of patients, two of 
the clinically diagnosed obsessive-compulsives 
show significant negative loadings on Factor 
C (using .35 as a criterion of significance). 
The patients who defined Factor C had been 
clinically diagnosed Anxiety State (two pa- 
tients) and Hysteric (one patient). 

Traits which are specific to Factor C which 
do not appear in Factor X are as follows: 


Nonegocentered vs. egocentered. 

Acts independently vs. lacks initiative. 
Blames self vs. does not blame self. 

No hostile passive obstructionism vs. hostile 
passive obstructionism. 

Not hostile vs. hostile. 

Not irritable vs. irritable. 

Gastric symptoms vs. no gastric symptoms. 


The dimension involved in Factor C appears 
to be basically one of unselfishness vs. selfish- 
ness. Factor C resembles somewhat Lorr and 
Rubinstein’s (9) Factor E, which they termed 
dependent immaturity vs. independent ma- 
turity. It is also similar to Cattell’s (4) 
Source Trait A. Factor C corresponds in gen- 
eral to the surface trait which Cattell (4) de- 
scribes as infantile, demanding self-centered- 
ness vs. emotional maturity, frustration toler- 
ance. 


Factor D 


Factor D cannot be definitely described as 
a bipolar factor. However, since it is defined 
by a group of patients at its negative pole, 
positive loadings approach significance (.29), 
and the factor loadings in both directions are 


relatively low (— .49 to .29), characteristics 
of both poles are presented. 

Traits specific to D, not involved in second- 
order factors are as follows: 


Phlegmatic—-tense. 
Self preoccupied—attentive to external mat- 
ters. 
Seeks support from others—self-sufficient. 
Does not project blame—projects blame. 
Denies frustration—exaggerates frustration. 
Needs approval—does not need approval. 
Concerned over hostility—unconcerned over 
hostility. 
Sex conflict—no sex conflict. 
No obsessive hostile impulses—obsessive hos- 
tile impulses. 
No behavioral reaction to alcohol—marked be- 
havioral reaction to alcohol. 
No body preoccupation—body preoccupation. 
No headaches—headaches. 

67 Cardiovascular symptoms—no cardiovascular 
symptoms. 


This facior presents a syndrome character- 
ized at its positive pole by submissiveness, 
inner conflict, insecurity, guilt, and depend- 
ency and at its negative pole by conflict with 
authority, aggression, absence of guilt, and 
self-sufficiency. Cardiovascular symptoms are 
associated with the former and headaches with 
the latter. Introversion, guilt, conflict, and 
inhibition describe a personality structure as- 
sociated with the diagnosis of anxiety neu- 
rosis, and cardiovascular symptoms may per- 
haps be an autonomic expression of anxiety. 
The factor under consideration has been re- 
lated to the clinical diagnosis Passive De- 
pendency Reaction because of the absence of 
overt anxiety symptoms; however, inner con- 
flict and inadequacy feelings appear to be cen- 
tral to this syndrome, one of extreme inner 
discomfort. The basic dimension of Factor D 
is one of submission to outer authority and 
to superego dictates vs. rebellion against outer 
authority and absence of guilt. The cluster of 
traits which characterizes the negative pole 
of Factor D suggests a psychopathic deviate 
syndrome. The dimension represented by Fac- 
tor D appears to be very similar to Eysenck’s 
(6) factor which divided his population into 
a social misfit vs. a psychological conflict 
group. It also corresponds to some degree to 
Rao and Slater’s (11) factor, obsessive-com- 
pulsive vs. psychopath. Lorr and Rubinstein’s 
Factor D, which they described as a dimension
of inadequacy feelings, is very similar to
Factor D. In both factors, headaches are as- 
sociated with absence of conscious inferiority 
feelings. Cattell’s (3) Source Trait E, ascendance
vs. submission, also appears to be comparable
to Factor D.


Factor X 


This second-order factor has been termed 
a schizothymia vs. cyclothymia dimension in 
terms of the traits of Factors A and C which 
define its positive and negative poles respec- 
tively. In factorial studies, a schizothyme- 
cyclothyme factor has frequently appeared, 
and each pole has appeared as a separate fac- 
tor at times. Cattell (3) concludes that this 
dimension represents a basic temperamental 
difference in people, corresponding to Kretsch- 
mer’s dichotomy based on body type. 


Factor Y 


Factor Y represents a dimension defined at 
its positive pole by primary Factor B. Factor 
Y appears to be similar to the negative pole 
of Cattell’s (4) Factor C and to the factor 
General Emotionality found by Eysenck (5), 
by Rao and Slater (11), and by numerous 
other analysts. 

Perhaps the fact that clinical groupings of 
patients were lost in the statistical analysis 
can be explained to some extent by the ob- 
servation that obverse factor analysis may 
classify individuals into types on the basis of 
underlying generalities rather than on readily 
observable surface characteristics. In an ob- 
verse analysis of four types of geometrical 
solids (cones, rectangular prisms, rectangular 
pyramids, and cylinders), Lorr, Jenkins, and 
Medland (8) found that rotation yielded two 
bipolar composite factors: (a) tapering and 
curvilinear vs. nontapering and rectilinear; 
and (b) tapering and rectilinear vs. nontapering
and curvilinear. These represent underlying
abstractions rather than physically distinct
principles of classification. The distinguishing
gestalts of the four geometrical
solids did not emerge as factors, nor could
they be derived from the factors. In other 
words, the clusters of geometric solids could 
be named only from sensory information. 
Bom Mo Chung (1) reported results comparable
to those of the present investigation in
a study in which he attempted to arrive at
genetic species by an obverse factor analysis 
of the correlations between species based on 
14 genetic traits. He found only rough co- 
incidence of factorial types with conventional 
genetic species, but felt that this did not invalidate
the technique, since factorial types
might furnish a valid basis of classification 
independent of other classifications of the 
data. Lorr and Fields (7) in a factorial study 
of body types found Sheldon’s three compo- 
nents to represent patterns which could be 
accounted for by two factors best described 
as muscle or fat, leg length or torso length. 
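
The obverse technique discussed above can be illustrated with a minimal Python sketch. It is not the procedure used in this study. It assumes a small, hypothetical matrix of ratings, correlates persons (rather than traits) across the trait ratings, and extracts only the first unrotated principal factor by power iteration, in place of the factoring and rotation methods reported here. All names (`ratings`, `pearson`, `person_correlations`, `first_factor_loadings`) are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length rating profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def person_correlations(ratings):
    """Obverse step: correlate persons (rows) with one another across traits."""
    n = len(ratings)
    return [[pearson(ratings[i], ratings[j]) for j in range(n)] for i in range(n)]


def first_factor_loadings(corr, iterations=200):
    """Loadings of each person on the first principal factor (power iteration)."""
    n = len(corr)
    # Unequal starting values: an all-equal start could be orthogonal
    # to a bipolar factor and would then never converge to it.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    loadings = [x * sqrt(eigenvalue) for x in v]
    # The sign of a factor is arbitrary; orient it so person 0 loads positively.
    if loadings[0] < 0:
        loadings = [-x for x in loadings]
    return loadings


# Hypothetical data: 4 persons (rows) rated on 5 traits (columns).
ratings = [
    [5, 4, 1, 2, 5],
    [4, 5, 2, 1, 4],
    [1, 2, 5, 4, 1],
    [2, 1, 4, 5, 2],
]
corr = person_correlations(ratings)
loadings = first_factor_loadings(corr)
print([round(x, 2) for x in loadings])
```

In this toy example the first two and the last two persons receive loadings of opposite sign: it is the persons, not the traits, that are grouped into types at the two poles of a bipolar factor.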

Primary factors in the present analysis are 
highly intercorrelated; perhaps the addition 
of other traits might define them more clearly. 
It is felt that second-order factors were clearly 
established in the present analysis, but that 
primary factors require cross-validation. Pri- 
mary factors were so heavily saturated with 
second-order factors that clusters of patients 
may have been formed largely in terms of 
only two dimensions, obscuring some clini- 
cally observable differences between patients 
and accounting to some extent for discrep- 
ancies between psychiatric and clinical clas- 
sifications. 

The results of the present study cannot be 
said to validate or invalidate either factorial 
classification of patients or their classification 
by conventional psychiatric nosology. They do,
perhaps, explain to some extent the frustration
involved in attempting to validate psychological
measures against clinical diagnoses.


Summary 

The present investigation represents an at- 
tempt to test by the technique of obverse fac- 
tor analysis the existence of four distinct dis- 
ease reactions corresponding to psychiatric 
diagnoses in an outpatient population and to 
establish patterns of traits characteristic of 
each syndrome. 

Factorially isolated syndromes cut across 
clinical diagnostic lines, separating patients 
into groups which were different from the 
categories into which they had been placed by 
clinical diagnosis. The lack of agreement be- 
tween factorial and psychiatric groupings of 
individuals was not thought to invalidate 
either system of classification. 




Traits were grouped by this study into ag- 
gregations which formed recognizable clinical 
entities and corresponded in general to fac- 
tors isolated in other factor analyses. Pri- 
mary and second-order factors were tenta- 
tively identified as follows: 


Primary factors 

Factor A—Outpatient paranoid schizo- 
phrenic. 

Factor B—Conversion hysteric. 

Factor C—Intrapunitive, socially mature 
personality vs. narcissistic per- 
sonality. 

Factor D—Passive dependent personality 
vs. anti-social personality. 


Second-order factors 
Factor X—Schizothymia vs. cyclothymia. 
Factor Y—Uncontrolled emotionality vs. 
overcontrolled emotionality. 


Received March 25, 1957. 


References 


1. Bom Mo Chung. Differentiation of group patterns
by transposed factor analysis. Ewah Woman’s Univer.
Press Stud. Psychol., Seoul, Korea, 1954, No. I.

2. Burt, C. The factorial study of temperamental
traits. Brit. J. Psychol., 1948, 1, 178-203.

3. Cattell, R. B. The description and measurement
of personality. Yonkers, N. Y.: World Book, 1946.

4. Cattell, R. B. Personality. New York:
McGraw-Hill, 1950.

5. Eysenck, H. J. Dimensions of personality.
London: Kegan Paul, Trench, & Trubner, 1947.

6. Eysenck, H. J. Types of personality: A factorial
study of seven hundred neurotics. J. ment. Sci.,
1944, 90, 851-861.

7. Lorr, M., & Fields, U. A factorial study of body
types. J. clin. Psychol., 1954, 10, 182-185.

8. Lorr, M., Jenkins, R. L., & Medland, F. F. Direct
versus obverse factor analysis: A comparison of
results. Educ. psychol. Measmt, 1955, 15, 441-449.

9. Lorr, M., & Rubinstein, E. A. Factors descriptive
of psychiatric outpatients. J. abnorm. soc.
Psychol., 1955, 51, 514-522.

10. Lorr, M., & Rubinstein, E. A. Personality patterns
of neurotic adults in psychotherapy. J. consult.
Psychol., 1956, 20, 257-263.

11. Rao, C. R., & Slater, P. Multivariate analysis
applied to differences between neurotic groups.
Brit. J. Psychol. Statist. Sec., 1949, 2, 17-29.

12. Thurstone, L. L. Multiple-factor analysis.
Chicago: Univer. of Chicago Press, 1947.


Lorge, Irving, & Thorndike, Robert L. Lorge-Thorn- 
dike Intelligence Tests. Grades kgn.-12. 5 levels, 2 
forms at each level. Primary Battery (levels 1-2), 
(30) min., test booklet ($3.00 per 35), with key, 
record sheet, and manual, pp. 20. Verbal Battery 
(levels 3-5), 34 (45) min., or Nonverbal Battery
(levels 3-5), 27 (40) min. Consumable edition: 
test booklet ($2.40 per 35), with key, record sheet, 
and manual, pp. 32. Re-usable edition: test book- 
let ($2.40 per 35), with manual; IBM answer 
sheet ($1.20 per 35), with record sheet; IBM or 
hand-scoring key (21¢ ea.). Technical manual, pp. 
16 (12¢). Boston: Houghton Mifflin, 1957. 

The Lorge-Thorndike Intelligence Tests comprise 
eight different tests which span the range from kin- 
dergarten through high school. Each is available in 
two equivalent forms, and the three higher levels are 
furnished in two formats for use with or without 
separate answer sheets. For the Primary Battery, on 
two levels covering kindergarten and the first three 
grades, administration is oral and responses are made 
by marking pictures. Three types of items—oral vo- 
cabulary, pictorial classification, and pictorial pair- 
ing—are used. Levels 3 to 5, covering the range from 
Grade 4 to Grade 12, are provided with verbal and 
nonverbal batteries in separate booklets. The subtests 
of the Verbal Battery are word knowledge, sentence 
completion, verbal classification, verbal analogies, 
and arithmetic reasoning. The Nonverbal Battery has
figure analogies, figure classification, and number
series. 

The Technical Manual shows that the development 
and standardization of the tests involved the exami- 
nation of more than 136,000 children in 44 communi- 
ties in 22 states. Norms were based on a stratified 
sampling of communities rated in five groups by 
socioeconomic criteria. For all levels and tests, four 
types of norms may be read directly from raw 
scores: deviation IQs (M=100, SD=16), grade 
percentiles, grade equivalents, and age equivalents. 
Other technical data are fully presented and impres- 
sive. Alternate form reliabilities in single-grade sam- 
ples range from .76 to .89 with, not unexpectedly, 
the higher reliabilities appearing in the verbal tests 
at the higher levels. Corrected split-half reliabilities 
for single-grade groups run from .88 to .94. Verbal 
and nonverbal batteries correlate about .65. A factor 
analysis of the subtests shows that verbal and non- 
verbal factors account for most of the variance. Con- 
gruent validities with other appropriate tests are 
about .65, and the one reported study of predictive 
validity shows a correlation of scores and subsequent 
ninth-grade achievement to be .67. 
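
Two of the statistics cited in this review can be made concrete with a short Python sketch. It assumes that a deviation IQ is a linear rescaling of a standard score to M = 100 and SD = 16, as stated above, and that the split-half correction is the Spearman-Brown formula, which the review does not name. The norm-group mean and standard deviation in the example are hypothetical, not taken from the test manual, and the function names are illustrative.

```python
def deviation_iq(raw, norm_mean, norm_sd, iq_mean=100.0, iq_sd=16.0):
    """Rescale a raw score to a deviation IQ via its standard score.

    norm_mean and norm_sd describe the raw-score distribution of the
    examinee's norm group (hypothetical values in the example below).
    """
    z = (raw - norm_mean) / norm_sd
    return iq_mean + iq_sd * z


def spearman_brown(half_r):
    """Estimate full-length reliability from a half-test correlation.

    Assumption: this is the correction meant by "corrected split-half".
    """
    return 2 * half_r / (1 + half_r)


# A raw score one standard deviation above a hypothetical norm-group mean.
iq = deviation_iq(raw=60, norm_mean=50, norm_sd=10)

# A hypothetical correlation of .80 between the two halves of a test.
full_r = spearman_brown(0.80)

print(iq, round(full_r, 2))
```

Under these assumptions a score one standard deviation above the norm-group mean corresponds to a deviation IQ of 116, and the corrected reliability always exceeds the uncorrected half-test correlation when that correlation lies between 0 and 1.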

These tests have many merits in their well-planned 
and attractive formats and their numerous conven- 
iences for use. But their greatest strength is the ho- 
mogeneous content and the common standardization 
over so wide a range of developmental levels. They 
will give excellent service for many applications to 
educational administration, and guid- 
ance.—L. F. S. 







